PREFACE


This book is a treatment of some advanced topics in the theory of
numbers. It was written to follow the author's Topics in Number
Theory, Volume I, in which elementary number theory is presented.

The level of mathematical maturity required for Volume II is 
much higher than for Volume I. Moreover, results obtained in 
Volume I are used freely, and in several of the chapters a knowledge of 
specific topics in various other branches of mathematics is assumed.
In particular, knowledge of the theory of symmetric polynomials, as 
well as the rule for multiplying determinants, is needed for the 
algebraic theory in Chapter 3, and the theory of analytic functions is 
used both in the theorem of Schneider in Chapter 5 and in the in- 
vestigation of the distribution of primes in Chapter 7. There seemed 
to be no point in assuming background unnecessarily, however, so I 
have included brief discussions of groups and matrices, on a very
elementary level, in Chapter 1.

The treatment of quadratic forms, admittedly shallow, has been 
based on the properties of the modular group for two reasons. In the 
first place, the geometric interpretation makes the usual definition of 
reduced forms seem quite natural, while no real insight is afforded by 
merely listing an unmotivated set of inequalities. In the second 
place, this treatment provides a simple illustration of the power of the 
theory associated with elliptic functions, which is of considerable
importance in modern number theory. Such methods are not often 
taught in American universities, and I hope that this treatment may 
serve to stimulate interest in them. 

To the best of my knowledge, the algebraic form of the Thue-Siegel-Roth
theorem given in Chapter 4 has not previously appeared
in print.

W. J. L.

Ann Arbor, Michigan

November 1955

CHAPTER 1


BINARY QUADRATIC FORMS 

1-1 Introduction. One of the subjects treated in elementary
number theory is the possibility of representing a positive integer as
a sum of two squares.* The expression $x^2 + y^2$ which is of interest
for this problem is a special case of the general binary quadratic form

$$f(x, y) = ax^2 + bxy + cy^2. \tag{1}$$

(This in turn is a special case of the n-ary m-ic form, which is a 
homogeneous polynomial of degree m in n variables.) Systematic 
research in quadratic forms was begun by Gauss, and has since been 
extensively pursued. We shall not go very deeply into the subject, 
but prefer instead to develop general methods whose usefulness is not 
limited to the theory of quadratic forms, nor even to the theory of 
numbers. 

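Suppose that in (1) we make the linear homogeneous substitution

$$x = \alpha x' + \beta y', \qquad y = \gamma x' + \delta y', \tag{2}$$

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are integers and $D = \alpha\delta - \beta\gamma \neq 0$. Solving for
$x'$ and $y'$ gives

$$x' = \frac{\delta x - \beta y}{D}, \qquad y' = \frac{-\gamma x + \alpha y}{D}, \tag{3}$$

so it is only in case $D = \pm 1$ that to each integer pair $x$, $y$ corresponds
an integer pair $x'$, $y'$ and conversely. We shall eventually suppose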

*See, for example, LeVeque, Topics in Number Theory, vol. I (Reading,
Mass.: Addison-Wesley Publishing Company, Inc., 1956), Chapter 7. So
much use will be made of the results obtained in this book that it will be
referred to henceforth simply as Volume I.

that $D = +1$, for reasons that will appear later. Then

$$x' = \delta x - \beta y, \qquad y' = -\gamma x + \alpha y. \tag{4}$$

Substituting (2) into (1), we have a new form in $x'$ and $y'$,

$$g(x', y') = Ax'^2 + Bx'y' + Cy'^2,$$

where

$$\begin{aligned}
A &= a\alpha^2 + b\alpha\gamma + c\gamma^2, \\
B &= 2a\alpha\beta + b(\alpha\delta + \beta\gamma) + 2c\gamma\delta, \\
C &= a\beta^2 + b\beta\delta + c\delta^2.
\end{aligned} \tag{5}$$

If for suitable integral values of $x$ and $y$ we have $f(x, y) = n$, then, for
the corresponding values of $x'$ and $y'$ determined by equation (4),
$g(x', y') = n$.

It thus appears that, as far as questions of representation are concerned,
it would be senseless duplication to consider $f(x, y)$ and
$g(x', y')$ separately; every integer represented by $f$ is also represented
by $g$, and conversely. This leads us to call $f$ and $g$ equivalent,
and to write $f \sim g$, if one can be obtained from the other by a unimodular
linear substitution with integral coefficients,

$$x = \alpha x' + \beta y', \qquad y = \gamma x' + \delta y', \qquad \alpha\delta - \beta\gamma = 1.$$

This in turn brings up one of the principal problems of this chapter:
how to decide whether two given forms are equivalent.

The substitution (2) is described quite adequately by specifying
the coefficients $\alpha$, $\beta$, $\gamma$, and $\delta$; that is, by writing the matrix

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}.$$

This symbol does not represent a number, of course; it is simply an
array of the coefficients of the substitution, in the order in which they
occur in (2). However, we can give names to these matrices, and
deduce certain of their properties from the corresponding properties
of the associated substitutions. Thus if

$$M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}, \qquad M' = \begin{pmatrix} \alpha' & \beta' \\ \gamma' & \delta' \end{pmatrix},$$

then we shall say that $M$ and $M'$ are equal if and only if they correspond
to the same substitution, that is,

$$\alpha = \alpha', \quad \beta = \beta', \quad \gamma = \gamma', \quad \delta = \delta'.$$

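If for arbitrary $M$ and $M'$ we apply the corresponding substitutions
successively, so that

$$\begin{aligned}
x &= \alpha x' + \beta y', & x' &= \alpha' x'' + \beta' y'', \\
y &= \gamma x' + \delta y', & y' &= \gamma' x'' + \delta' y'',
\end{aligned}$$

we could accomplish the same thing by the single substitution

$$\begin{aligned}
x &= (\alpha\alpha' + \beta\gamma')x'' + (\alpha\beta' + \beta\delta')y'', \\
y &= (\gamma\alpha' + \delta\gamma')x'' + (\gamma\beta' + \delta\delta')y''.
\end{aligned}$$

Thus, if by the product $MM'$ of two matrices we mean the matrix of
this latter substitution, we must define

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}
\begin{pmatrix} \alpha' & \beta' \\ \gamma' & \delta' \end{pmatrix}
= \begin{pmatrix} \alpha\alpha' + \beta\gamma' & \alpha\beta' + \beta\delta' \\ \gamma\alpha' + \delta\gamma' & \gamma\beta' + \delta\delta' \end{pmatrix}.$$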


Thus the product has as element in the $i$th row and $j$th column, for
each $i$ and $j$, the sum of the products of the elements of the $i$th row of
the first matrix with the corresponding elements of the $j$th column of
the second matrix. Moreover, if the determinant of a matrix is defined
as

$$\det M = \alpha\delta - \beta\gamma,$$

it requires only a routine calculation to show that

$$\det M \cdot \det M' = \det (MM').$$

It is to be noticed that, in general, $MM' \neq M'M$, although

$$M(M'M'') = (MM')M''.$$

Since the substitutions given by (2) and (3) are inverse to each
other, it is natural to call the matrix of (3) the inverse of the matrix
of (2), and to designate it by $M^{-1}$. Then $MM^{-1} = M^{-1}M = I$,
where

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{7}$$

$I$ is called the identity matrix; it corresponds to the trivial substitution
$x = x'$, $y = y'$, and has the property that $MI = IM = M$ for
every $M$. A square matrix has an inverse if and only if its determinant
is different from zero.
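The rules just stated are easy to try out numerically. The following is only a small sketch, not part of the original text: the function names `mat_mul`, `det`, and `inverse_unimodular`, and the two sample matrices, are chosen here purely for illustration. A matrix is stored as a pair of rows.

```python
# A matrix ((alpha, beta), (gamma, delta)) stands for the substitution
# x = alpha*x' + beta*y',  y = gamma*x' + delta*y'.

def mat_mul(m, n):
    """Product MN: row i of m times column j of n."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def det(m):
    (a, b), (c, d) = m
    return a * d - b * c

def inverse_unimodular(m):
    """Inverse of a matrix of determinant 1, read off from the inverse substitution."""
    if det(m) != 1:
        raise ValueError("this sketch only handles determinant 1")
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

I = ((1, 0), (0, 1))
M = ((2, 1), (5, 3))   # illustrative matrix, det = 1
N = ((1, 4), (0, 1))   # illustrative matrix, det = 1

print(mat_mul(M, N))                       # ((2, 9), (5, 23))
print(mat_mul(N, M))                       # ((22, 13), (5, 3)): MN and NM differ
print(det(mat_mul(M, N)))                  # 1 = det M * det N
print(mat_mul(M, inverse_unimodular(M)))   # ((1, 0), (0, 1))
```

Integer entries are used throughout, so every check is exact.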

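Finally, we designate by $\overline{M}$ the transpose of $M$, obtained by interchanging
rows and columns in $M$:

$$\overline{M} = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix}.$$

The transpose of a product is the product of the transposes, in reverse
order:

$$\overline{(MM')} = \overline{M'}\,\overline{M}.$$

Also, the transpose of the inverse is equal to the inverse of the
transpose:

$$\overline{(M^{-1})} = (\overline{M})^{-1}.$$

Matrices need not be square. Thus

$$\text{if } X = (x \;\; y), \text{ then } \overline{X} = \begin{pmatrix} x \\ y \end{pmatrix};$$

note, however, that nonsquare matrices have neither determinants
nor inverses.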

The importance of this algebra of matrices to our present purpose
lies in the fact that if

$$F = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix},$$

then

$$XF\overline{X} = (x \;\; y) \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = (ax^2 + bxy + cy^2).$$

Although it is a slight abuse of language, it is convenient and in the
present context harmless to identify a one-by-one matrix with the
element itself, so we write

$$f(x, y) = XF\overline{X}.$$

$F$ is called the matrix of the form, and $\Delta = 4 \cdot \det F = 4ac - b^2$ is
called the discriminant of the form.

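In terms of matrices, the substitution equations (2) and (3) can
be written as

$$X = X'\overline{M} \quad\text{and}\quad X' = X\overline{M}^{-1},$$

respectively. Thus

$$f(x, y) = XF\overline{X} = (X'\overline{M})F\,\overline{(X'\overline{M})} = (X'\overline{M})F(M\overline{X'}) = X'(\overline{M}FM)\overline{X'},$$

so that the matrix of $g$ is $G = \overline{M}FM$. (The reader might test his
ability to manipulate matrices by showing that the last equation is in
agreement with equations (5).) Multiplying both sides of the equation
$G = \overline{M}FM$ by $\overline{M}^{-1}$ on the left and by $M^{-1}$ on the right, we have

$$\overline{M}^{-1}GM^{-1} = \overline{M}^{-1}(\overline{M}FM)M^{-1} = (\overline{M}^{-1}\overline{M})F(MM^{-1}) = F.$$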

If $\det M = 1$, then also $\det \overline{M} = 1$, and

$$\det G = \det(\overline{M}FM) = \det\overline{M} \cdot \det F \cdot \det M = \det F,$$

so that the discriminant of a form is not changed by a unimodular 
substitution. 
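As a numerical illustration of the last few paragraphs, the sketch below applies a unimodular substitution to a form by means of equations (5) and checks that the discriminant and the represented values are unchanged. It is an illustration only: the names `transform`, `discriminant`, and `evaluate`, and the sample form and matrix, are assumptions made here and do not come from the text.

```python
def transform(form, m):
    """Coefficients (A, B, C) of g obtained from f = (a, b, c) by the
    substitution with matrix m = ((alpha, beta), (gamma, delta)), as in (5)."""
    a, b, c = form
    (al, be), (ga, de) = m
    A = a * al * al + b * al * ga + c * ga * ga
    B = 2 * a * al * be + b * (al * de + be * ga) + 2 * c * ga * de
    C = a * be * be + b * be * de + c * de * de
    return (A, B, C)

def discriminant(form):
    a, b, c = form
    return 4 * a * c - b * b

def evaluate(form, x, y):
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

f = (1, 0, 1)            # x^2 + y^2, chosen for illustration
M = ((2, 1), (5, 3))     # determinant 1
g = transform(f, M)

print(g)                                  # (29, 34, 10)
print(discriminant(f), discriminant(g))   # 4 4
# g(x', y') equals f(x, y) at the corresponding point x = 2x' + y', y = 5x' + 3y'
print(evaluate(g, 1, -1), evaluate(f, 2 * 1 + 1 * (-1), 5 * 1 + 3 * (-1)))  # 5 5
```

The same function works for any integral matrix; only when the determinant is 1 is the discriminant preserved.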

In summary, a form with matrix $F$ is equivalent to a form with
matrix $G$ if and only if there is a matrix $M$ such that $G = \overline{M}FM$ and
$\det M = 1$; equivalent forms have the same discriminant and
represent the same integers.

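The relation of “equivalence,” as used here, is an equivalence relation
in the technical sense.* For it is clear that

(a) $f \sim f$: $F = \overline{I}FI$;

(b) $f \sim g$ implies $g \sim f$: $G = \overline{M}FM$ implies $F = \overline{(M^{-1})}GM^{-1}$;

(c) $f \sim g$ and $g \sim h$ implies $f \sim h$: $G = \overline{M}FM$ and $H = \overline{M'}GM'$
imply $H = \overline{M'}\,\overline{M}FMM' = \overline{(MM')}F(MM')$.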

Thus all the forms equivalent to a given one are equivalent to each 
other, and the set of all forms splits up into equivalence classes, any 
two elements in one class being equivalent, and elements from differ- 
ent classes being inequivalent. (The equivalence classes for the rela- 
tion of congruence (mod m) are simply the residue classes modulo m.) 
Just as we chose a system of representatives of the various residue 
classes modulo $m$, we would like to pick a system of representative
forms, one from each class. It is the object of the next two sections to 
develop machinery by which such reduced forms can be obtained in a 
natural way. 


*See, for example, Volume I, Section 3-1.

primitive root of $q$ is a generator of $M(q)$; thus Theorem 4-11 is a
statement of the fact that $M(q)$ is cyclic (consists of the powers of
a single element) if and only if $q = 1, 2, 4, p^\alpha$, or $2p^\alpha$, where $p$ is
an odd prime. If $\alpha > 2$, $M(2^\alpha)$
has two generators, $-1$ and 5; for example, modulo 16, the powers of
5 are 5, 9, 13, and 1, and these numbers, together with their negatives,
form a reduced residue system.

At present we shall do no more with finite groups, but turn our
attention instead to the much more complicated multiplicative group
of all two-by-two matrices with integral entries and unit determinants.
This infinite group, which will be designated here by $\Gamma$, is called the
modular group. To show that $\Gamma$ is a group, we verify properties (a)
through (d) above. The system is obviously closed under multiplication,
since the determinant of a product is the product of the determinants
of the factors. The associative property has already been
verified. The identity element of $\Gamma$ is $I$, as defined in (7). The inverse
of any element

$$M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

of $\Gamma$ is

$$M^{-1} = \begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix},$$

which is again in $\Gamma$.

The group $\Gamma$ differs from the other examples mentioned in that it is
noncommutative, since in general $MM' \neq M'M$. (Abstractly, $G$ is
said to be a commutative or abelian group if $a \circ b = b \circ a$ for every $a$
and $b$ in $G$.)


1-3 The modular group. The properties of $\Gamma$ could all be developed
by the use of algebra alone; we prefer instead to build up the
theory with the help of a simple geometric interpretation. It is now
convenient to reverse the roles of the accented and unaccented
variables in the equations (2); this new notation will be used throughout
the discussion of the modular group, but the original system will
be reverted to when quadratic forms are again considered. To keep
matters straight, (2) will be termed a substitution, while the modified
equations will be called a transformation. Putting $z = x/y$ and
$z' = x'/y'$, we get

$$z' = \frac{\alpha z + \beta}{\gamma z + \delta}. \tag{8}$$

So far nothing essential has been accomplished. The crucial point
lies in allowing $z$ to range over all complex numbers, rather than the
real rationals to which it was formerly restricted. Then equation (8)
can be regarded as defining a transformation or mapping of the complex
$z$-plane into the $z'$-plane. Somewhat more than this can be said:
if

$$M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

is in $\Gamma$, so that $\det M = 1$, a simple calculation shows that the imaginary
parts of $z$ and $z'$ have the same sign. In other words, (8) maps
the upper half of the $z$-plane (i.e., the region where the imaginary part
of $z$ is positive) into the upper half of the $z'$-plane, and the lower half
into the lower half. Hereafter, we restrict attention to the upper half
planes.
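The "simple calculation" referred to above amounts to the identity Im z' = Im z / |γz + δ|², valid when the determinant is 1. The following sketch checks it in floating point for one matrix and one point; the function name `mobius` and the sample values are assumptions introduced here for illustration.

```python
def mobius(m, z):
    """Apply z' = (alpha*z + beta) / (gamma*z + delta) for m = ((alpha, beta), (gamma, delta))."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

M = ((2, 1), (5, 3))          # illustrative matrix of determinant 1
z = complex(0.3, 0.7)         # a point of the upper half plane
w = mobius(M, z)

# Im z' = Im z / |gamma*z + delta|^2, so the sign of the imaginary part is kept.
print(w.imag, z.imag / abs(5 * z + 3) ** 2)

# The matrices M and -M give the same transformation.
minus_M = ((-2, -1), (-5, -3))
print(abs(mobius(M, z) - mobius(minus_M, z)) < 1e-12)   # True
```

Floating-point arithmetic is used only for convenience; exact rational arithmetic would serve equally well for points with rational coordinates.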

It is convenient to identify the $z$- and $z'$-planes, and to think of (8)
as sending each point $z$ of the upper half $U$ of the complex plane into
another point $z'$ of $U$. We also identify the elements of $\Gamma$ with the
corresponding transformations (8), which has the effect of identifying
the matrices

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} -\alpha & -\beta \\ -\gamma & -\delta \end{pmatrix}.$$

In accordance with the earlier definition of equivalence, two points
$z$ and $z'$ of $U$ will be called equivalent if one can be mapped into the
other by a transformation of $\Gamma$. As usual, this assigns each point of $U$
to an equivalence class; two elements of the same class are equivalent,
and elements from different classes are inequivalent. A region $R$ of $U$
is called a fundamental region if no two of its points are equivalent,
while every point of $U$ is equivalent to a point of $R$; in other words,
$R$ constitutes a complete system of representatives of the above equivalence
classes. It would be more precise to refer to $R$ as a fundamental
region of the group $\Gamma$, since two points may be equivalent with
respect to one group of transformations but not with respect to
another. For example, it is clear that a fundamental region $R'$ of a
subgroup $\Gamma'$ of $\Gamma$ contains a fundamental region of $\Gamma$ itself, if both
regions exist. For any point in $U$, being equivalent to some point of
$R'$ under the transformations of $\Gamma'$, is a fortiori equivalent to the same
point of $R'$ under the transformations of the larger group $\Gamma$. It may
not be true, however, that any two points of $R'$ are inequivalent with
respect to $\Gamma$.


Theorem 1-1. The region $R$ in $U$ composed of all points $z$ such that
$-\tfrac12 \le \operatorname{Re} z < \tfrac12$ and either $|z| > 1$, or else $|z| = 1$ and
$-\tfrac12 \le \operatorname{Re} z \le 0$, is a fundamental region of $\Gamma$. (See Fig. 1-1.)

Proof: First note that $\Gamma$ has the subgroup $\Gamma_0$ of all integral
translations $z' = z + \beta$. For the associated matrix

$$\begin{pmatrix} 1 & \beta \\ 0 & 1 \end{pmatrix}$$

has determinant 1, the identity transformation $z' = z$ is in $\Gamma_0$,
the inverse of $z' = z + \beta$ is $z' = z - \beta$ and is in $\Gamma_0$, and the result
of making two translations is again a translation. $\Gamma_0$ is cyclic,
being generated by

$$z' = z + 1. \tag{9}$$

As a fundamental region of $\Gamma_0$ we could choose any infinite strip in $U$
of unit width, extending parallel to the imaginary axis from the real
axis. We take the following one:

$$R_0: \quad \operatorname{Im} z > 0, \qquad -\tfrac12 \le \operatorname{Re} z < \tfrac12.$$

From the remark preceding the theorem, $R_0$ must contain a fundamental
region of $\Gamma$ if any exists. $R_0$ is not itself a fundamental region
of $\Gamma$, however, for the point $i/2$ of $R_0$ is transformed into the point $2i$
of $R_0$ by the transformation

$$z' = -\frac{1}{z}. \tag{10}$$

2 


With each transformation

$$T: \quad z' = \frac{\alpha z + \beta}{\gamma z + \delta}$$

with $\gamma \neq 0$, there is associated the circle $C(T)$: $|\gamma z + \delta| = 1$, with
center at $-\delta/\gamma$ and radius $1/|\gamma| \le 1$. Now

$$\gamma z' - \alpha = \gamma\,\frac{\alpha z + \beta}{\gamma z + \delta} - \alpha = \frac{\beta\gamma - \alpha\delta}{\gamma z + \delta} = -\frac{1}{\gamma z + \delta},$$

so that $C(T)$ is transformed by $T$ into $|\gamma z - \alpha| = 1$, which, by (3), is
$C(T^{-1})$. More importantly, the exterior of $C(T)$ goes into the
interior of $C(T^{-1})$. It is simple to deduce from this that no two points
of the region $R$ described in the theorem are equivalent. Certainly no
point of $R$ is mapped into another by an element of $\Gamma_0$. But if $T$ is
not in $\Gamma_0$, then $\gamma \neq 0$, and since the interior of $R$ is external to all the
circles $C(T)$ (inasmuch as they all have radii $\le 1$ and are centered at
real points), any interior point of $R$ is mapped by $T$ into an interior
point of one of these circles, and hence into a point outside $R$.

The arc $A$: $|z| = 1$, $-\tfrac12 \le \operatorname{Re} z \le 0$, which forms part of the
boundary of $R$, is also completely exterior to all the circles $C(T)$ except
$|z| = 1$ and $|z + 1| = 1$. The circle $|z| = 1$ is associated with
transformations

$$z' = \frac{\alpha z + \beta}{z},$$

and since the determinant must be 1, $\beta = -1$ and

$$z' = \alpha - \frac{1}{z}.$$

If $z$ is a point of $A$, $|z' - \alpha| = 1$, and so $z'$ is not in $R$ unless $\alpha = 0$ or
$-1$. If $\alpha = 0$ we have the transformation (10), which sends $A$ onto
the arc $|z| = 1$, $0 \le \operatorname{Re} z \le \tfrac12$; this arc has only the point $i$ in common
with $R$, and $i$ goes into itself. (This means that $i$ is equivalent to
itself in two different ways: $z' = z$ and $z' = -1/z$.) If $\alpha = -1$, $A$
goes into the arc $|z + 1| = 1$, $-1 \le \operatorname{Re} z \le -\tfrac12$; these two arcs have
just $\rho = -\tfrac12 + i\sqrt{3}/2$ in common, and $\rho$ goes into itself.

The circle $|z + 1| = 1$ is associated with transformations

$$z' = \frac{\alpha z + (\alpha - 1)}{z + 1} = \alpha - \frac{1}{z + 1},$$

so that, for a point $z$ of $A$,

$$|z' - (\alpha - 1)| = |z' - \alpha|, \qquad \operatorname{Re} z' = \alpha - \tfrac12,$$

and $z'$ is not in $R$ unless $\alpha = 0$. Under the transformation

$$z' = -\frac{1}{z + 1},$$

the arc $A$ goes into the line segment $\operatorname{Re} z = -\tfrac12$, $\tfrac12 \le \operatorname{Im} z \le \sqrt{3}/2$;
the arc and the segment have just $\rho$ in common, and $\rho$ goes into itself.
We have thus shown that no two points of R are equivalent, and have 
incidentally obtained the following result, which will be useful later. 

Theorem 1-2. The point $\rho = (-1 + i\sqrt{3})/2$ is mapped into
itself by the three transformations

$$z' = z, \qquad z' = -\frac{z + 1}{z}, \qquad\text{and}\qquad z' = -\frac{1}{z + 1},$$

and by no others. The point $i$ is mapped into itself by the two transformations

$$z' = z \qquad\text{and}\qquad z' = -\frac{1}{z},$$

and by no others. Any point of $R$ different from $\rho$ and $i$ is mapped
into itself only by the identity transformation $z' = z$.

To complete the proof of Theorem 1-1, we must show that any
point $z$ in $U$ is the image of a point in $R$ under a transformation of $\Gamma$.
We do this by finding a finite sequence of transformations such that if
they are successively applied to $z$, the final point $z'$ is in $R$. Then the
inverse of the product of these transformations maps $z'$ back into $z$.

Designate by $S$ the generator (9) of $\Gamma_0$, and by $W$ the transformation
(10). Let $z$ be a point of $U$ not in $R$. Then for some integer $n_1$,
which may be positive, negative, or zero, $z_1 = S^{n_1}z = z + n_1$ is in $R_0$,
the fundamental domain of $\Gamma_0$. If $z_1$ is in $R$, we are finished. If
$|z_1| = 1$ but $0 < \operatorname{Re} z_1 < \tfrac12$, then $Wz_1$ is in $R$. If $|z_1| < 1$, then
$z_2 = Wz_1$ has modulus greater than 1. In fact, if $z_1 = x_1 + iy_1$,

$$z_2 = \frac{-x_1 + iy_1}{x_1^2 + y_1^2} = x_2 + iy_2,$$

so that $\operatorname{Im} z_2 > \operatorname{Im} z_1$, and if $y_1 < \tfrac12$ then $\operatorname{Im} z_2 > 2 \operatorname{Im} z_1$, since
then $x_1^2 + y_1^2 < \tfrac12$. If $z_2$ is in $R$, we are through. If not, there is a
suitable exponent $n_2$ such that $z_3 = S^{n_2}z_2$ is in $R_0$, and $\operatorname{Im} z_3 = \operatorname{Im} z_2$.
If $z_3$ is not in $R$, we can apply $W$ again, and get $z_4 = WS^{n_2}WS^{n_1}z$.
What we must show is that after finitely many steps, this process
leads to a point in $R$.

As long as $y_k < \tfrac12$, we will have $y_{k+1} > 2y_k$, if $z_{k+1} = Wz_k$. Starting
with a positive number (the imaginary part of $z$), a finite number
of doublings will produce a number larger than $\tfrac12$. So suppose that
we have obtained a $z_k = x_k + iy_k$ such that

$$-\tfrac12 \le x_k < \tfrac12, \qquad y_k \ge \tfrac12, \qquad x_k^2 + y_k^2 < 1. \tag{11}$$

Then

$$z_{k+1} = -\frac{1}{z_k}, \qquad z_{k+2} = -\frac{1}{z_k} + n,$$

where $n$ is so determined that $-\tfrac12 \le x_{k+2} < \tfrac12$. This gives

$$z_{k+2} = \frac{nx_k - 1 + iny_k}{x_k + iy_k},$$

so that

$$|z_{k+2}|^2 = \frac{(nx_k - 1)^2 + n^2y_k^2}{x_k^2 + y_k^2}.$$

If $|n| \ge 2$,

$$|z_{k+2}|^2 \ge \frac{n^2y_k^2}{x_k^2 + y_k^2} > n^2y_k^2 \ge 4 \cdot \tfrac14 = 1,$$

while if $|n| = 1$, the hypothetical inequality $|z_{k+2}|^2 < 1$ gives

$$(x_k - n)^2 + y_k^2 < x_k^2 + y_k^2,$$

so that $z_k$ is farther from the origin than from the point
$n$, which is false from the first inequality of (11). Finally, if
$n = 0$, $|z_{k+2}|^2 = 1/|z_k|^2 > 1$. Hence in all cases, $|z_{k+2}| \ge 1$, and
$-\tfrac12 \le x_{k+2} < \tfrac12$. If $z_{k+2}$ is still not in $R$ (which may happen if
$|z_{k+2}| = 1$ and $x_{k+2} > 0$), then $Wz_{k+2}$ is in $R$, and the proof is complete.
Moreover, the proof has shown that $S$ and $W$ are generators of $\Gamma$,
since every transformation of $\Gamma$ can be written in the form

$$S^{n_k}W \cdots WS^{n_2}WS^{n_1}.$$
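The procedure used in this proof (translate into the strip, invert if the point is not yet in R, repeat) can be written out as a short program. What follows is a sketch only, restricted to points with rational real and imaginary parts so that exact arithmetic can be used; the names `in_R` and `reduce_point` and the two sample points are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from math import floor

HALF = Fraction(1, 2)

def in_R(x, y):
    """True if z = x + iy lies in the region R of Theorem 1-1."""
    if y <= 0 or not (-HALF <= x < HALF):
        return False
    norm = x * x + y * y
    return norm > 1 or (norm == 1 and x <= 0)

def reduce_point(x, y):
    """Move z = x + iy (rational x, y with y > 0) into R.
    Returns the final point and the list of steps, in the order applied."""
    x, y = Fraction(x), Fraction(y)
    if y <= 0:
        raise ValueError("the point must lie in the upper half plane")
    steps = []
    while True:
        n = -floor(x + HALF)          # the translation S^n brings Re z into [-1/2, 1/2)
        if n != 0:
            x += n
            steps.append(f"S^{n}")
        if in_R(x, y):
            return x, y, steps
        norm = x * x + y * y          # W: z -> -1/z = (-x + iy) / (x^2 + y^2)
        x, y = -x / norm, y / norm
        steps.append("W")

print(reduce_point(0, Fraction(1, 2)))
# (Fraction(0, 1), Fraction(2, 1), ['W'])
print(reduce_point(Fraction(36, 13), Fraction(2, 13)))
# (Fraction(0, 1), Fraction(2, 1), ['S^-3', 'W', 'S^-3'])
```

The first sample is the point i/2 mentioned in the proof; the second, (36 + 2i)/13, is an arbitrary point chosen here, and it turns out to be equivalent to 2i as well.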
Figure 1-2

A geometric representation of the group $\Gamma$ is given in Fig. 1-2.
Here we have considered the region $R$ as the image of itself under the
identity transformation $I$, and have put $R = R(I)$. The congruent
unshaded region to the left of $R$ is then $R(S^{-1})$, in the sense that if a
point $z'$ of it is equivalent to a point $z$ of $R$, then $z' = S^{-1}z$. To put it
differently, $S^{-1}$ maps $R$ onto this region, just as $W$ maps $R$ onto the
unshaded region $R(W)$ immediately below $R$. The semicircular arcs
are portions of the circles $C(T)$; infinitely many of them terminate in
each rational point on the real axis. If the drawing and shading were
completed, any shaded or unshaded region could be taken as a fundamental
region. Each fundamental region or “double triangle” is
bounded by three arcs, with vertex angles of 0, $\pi/3$, and $\pi/3$. The
heavy arc inside each region indicates the portion of the boundary
which is to be included in the region.

PROBLEMS 

1. Find the point in $R$ to which the point

$$z = \frac{3 + 2i}{8 + 6i}$$

is equivalent, by the method used in the proof of Theorem 1-1. Do you 
see an easier way, for this particular number? 

2. If the term “circle” is used in the broad sense to include straight lines,
show that the transformations of $\Gamma$ send circles into circles. Under what
circumstances are the image circles actually lines? What can be said about
such a line if, for every point $z$ on the original circle, $\operatorname{Im} z > 0$?




3. Verify that, in the notation of the text, $(SW)^3 = I$.

4. Show that the transformations


P: 2' = 1 - 


2, 


Q: 


z = 


1 




$\operatorname{Im} z > 0$, $|z| > 1$, and $|z - 1| > 1$,

boundary of this region, leading from $(1 + i\sqrt{3})/2$
(Note that the .h, "“1* 

1-4 Reduced definite forms. With the help of the facts now
at our disposal, we can determine when two quadratic forms are equivalent.
The discriminant $\Delta = 4ac - b^2$ of the form $ax^2 + bxy + cy^2$
may be positive, negative, or zero. A form for which $\Delta > 0$ is called
definite; otherwise it is indefinite. The
definite forms can be further classified as positive or negative, according
as $a > 0$ or $a < 0$; this terminology is explained by writing the form as

$$y^2\left(a\left(\frac{x}{y}\right)^2 + b\,\frac{x}{y} + c\right),$$

which has the same sign as $a$ for every choice of $x$ and $y$ except $x = y = 0$.
We consider definite forms first, restricting our attention to positive forms,
since the theory of negative forms is almost identical.
Since the matrix of a form is a little cumbersome, we shall use the
symbol $[a, b, c]$ to designate the form $ax^2 + bxy + cy^2$; it is to be
clearly understood that this is merely an abbreviation, and cannot be
combined with other symbols as matrices can.

Let us consider then a positive definite form $f(x, y) = [a, b, c]$, in
which $\Delta > 0$, $a > 0$, $c > 0$. For the time being, we do not require
that $a$, $b$, and $c$ be integers. Then the quadratic polynomial
$az^2 + bz + c$ has the two complex zeros

$$\frac{-b \pm \sqrt{-\Delta}}{2a};$$

of these, we single out the one with positive imaginary part and call
it $\omega$. Thus to the form $[a, b, c]$ there corresponds the point $\omega$ in the
upper half plane. Conversely, each point in $U$ corresponds to exactly
one form of discriminant $\Delta$. For if $z_0$ is such a point, and $\bar{z}_0$ is its
complex conjugate, then there is a unique positive number $k$ such that the
quadratic expression $k(z - z_0)(z - \bar{z}_0)$ has discriminant $\Delta$. Hence if
we consider only forms of given discriminant $\Delta$ (which is all that is
required in the equivalence problem, since equivalent forms have the
same discriminant), there is a one-to-one correspondence between
points of $U$ and forms of that discriminant. Moreover, if the points
$\omega_1$ and $\omega_2$ are associated with the forms $f_1$ and $f_2$ of discriminant $\Delta$,
and if a transformation $T$ of $\Gamma$ carries $f_1$ into $f_2$, then it carries $\omega_1$ into
$\omega_2$. It therefore makes no difference whether one speaks of the form $f$
or the point $\omega$, as far as the operations of $\Gamma$ are concerned. We call $\omega$
the representative of $f$.

It should now be clear how to decide whether or not two forms are 
equivalent. If they do not have the same discriminant, they are not 
equivalent. If they have, they are equivalent if and only if their 
representatives are equivalent, and this can be decided by trans- 
forming the representatives into the fundamental region R, where 

they must be identical to be equivalent. This leads us to define a 
reduced form as one whose representative is in R; reduced forms are 
equivalent if and only if they are identical, and each class of equivalent 

forms contains exactly one reduced form. 


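Since

$$\omega = \frac{-b + \sqrt{-\Delta}}{2a}, \qquad |\omega|^2 = \frac{b^2 + \Delta}{4a^2} = \frac{c}{a},$$

$\omega$ is in $R$ if and only if $-\tfrac12 \le -b/2a < \tfrac12$ and either $c/a > 1$, or
$c/a = 1$ and $-\tfrac12 \le -b/2a \le 0$. Simplifying, we have that $[a, b, c]$
is reduced if and only if either

$$-a < b \le a < c \qquad\text{or}\qquad 0 \le b \le a = c. \tag{13}$$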
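The conditions (13) translate directly into a test on the coefficients. The sketch below is illustrative only: the names `is_reduced` and `representative` are introduced here, and the representative is computed in floating point merely to display it.

```python
from math import sqrt

def is_reduced(form):
    """Conditions (13) for a positive definite form [a, b, c]."""
    a, b, c = form
    return (-a < b <= a < c) or (0 <= b <= a == c)

def representative(form):
    """The zero of a*z^2 + b*z + c with positive imaginary part (floating point)."""
    a, b, c = form
    delta = 4 * a * c - b * b
    if a <= 0 or delta <= 0:
        raise ValueError("the form must be positive definite")
    return complex(-b, sqrt(delta)) / (2 * a)

print(is_reduced((1, 1, 1)), representative((1, 1, 1)))   # True, approximately -0.5 + 0.866i
print(is_reduced((1, 0, 1)), representative((1, 0, 1)))   # True 1j
print(is_reduced((1, -1, 1)))   # False: a = c requires b >= 0
print(is_reduced((2, -2, 3)))   # False: b must exceed -a
```

The two reduced forms shown have the points ρ and i as representatives, which are exactly the exceptional points of Theorem 1-2.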
PROBLEM

Prove the assertion, made in the text, that if $\omega_1$ and $\omega_2$ are the representatives
of the forms $f_1$ and $f_2$ with discriminant $\Delta$, and if a $T$ in $\Gamma$ carries
$f_1$ into $f_2$, it carries $\omega_1$ into $\omega_2$.


1-5 Reduction of definite forms. A given form can be transformed
into its equivalent reduced form by exactly the process used
in the proof that $R$ is a fundamental region of $\Gamma$. That is, by a translation
$S^{n_1}$, $\omega$ can be changed into $\omega'$, where $-\tfrac12 \le \operatorname{Re}\omega' < \tfrac12$; if $\omega'$
is not in $R$, we begin afresh with $W\omega'$, etc. The translation $z' =
z + n_1$ must be such that

$$-\tfrac12 \le -\frac{b}{2a} + n_1 < \tfrac12,$$

or

$$b = 2an_1 + r_1, \qquad\text{where } -a < r_1 \le a.$$

The transformation $z' = z + n_1$ has matrix

$$\begin{pmatrix} 1 & n_1 \\ 0 & 1 \end{pmatrix},$$

but we must now revert to the inverse transformation $z = z' - n_1$ to
utilize the results of Section 1-1, which were based on the equations
(2). If we put

$$M = \begin{pmatrix} 1 & -n_1 \\ 0 & 1 \end{pmatrix},$$

then, as we saw earlier, $M$ carries a form with matrix $F$ into one with
matrix

$$G = \overline{M}FM,$$

so that in this case, if we let the result of the first translation be
$f_1(x, y) = XF_1\overline{X}$, then

$$F_1 = \begin{pmatrix} 1 & 0 \\ -n_1 & 1 \end{pmatrix} F \begin{pmatrix} 1 & -n_1 \\ 0 & 1 \end{pmatrix}.$$

Similarly, if $F_2$ is the result of applying the inversion $W$ to $F_1$, then

$$F_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} F_1 \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

A simple calculation shows that, if $f_1 = [a, b, c]$, then $f_2 = [c, -b, a]$.
Thus we have the following algorithm for reducing $f = [a, b, c]$: find
$n_1$ and $r_1$ such that

$$b = 2an_1 + r_1, \qquad -a < r_1 \le a,$$

and compute $f_1 = [a_1, b_1, c_1]$, where

$$F_1 = \begin{pmatrix} 1 & 0 \\ -n_1 & 1 \end{pmatrix} F \begin{pmatrix} 1 & -n_1 \\ 0 & 1 \end{pmatrix},$$

so that $f_1 = [a, b - 2an_1, n_1^2a - bn_1 + c]$. If $f_1$ is not reduced, put
$f_2 = [c_1, -b_1, a_1] = [a_2, b_2, c_2]$. If $f_2$ is not reduced, repeat the entire
procedure. For some $k$, $f_k$ will be reduced.
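The algorithm just described can be sketched in a few lines for integral forms. The version below is an illustration under stated assumptions: the function names `is_reduced` and `reduce_form` are chosen here, the translation step takes the remainder in the range −a < r ≤ a, and the sample form [5, 12, 8] is an arbitrary example.

```python
def is_reduced(a, b, c):
    """Conditions (13)."""
    return (-a < b <= a < c) or (0 <= b <= a == c)

def reduce_form(a, b, c):
    """Reduce a positive definite integral form [a, b, c]; returns the reduced form."""
    if a <= 0 or 4 * a * c - b * b <= 0:
        raise ValueError("the form must be positive definite")
    while True:
        # Translation: choose n with b = 2*a*n + r and -a < r <= a.
        n = -((a - b) // (2 * a))
        a, b, c = a, b - 2 * a * n, n * n * a - b * n + c
        if is_reduced(a, b, c):
            return (a, b, c)
        # Inversion W: [a, b, c] -> [c, -b, a].
        a, b, c = c, -b, a

print(reduce_form(5, 12, 8))     # (1, 0, 4)
print(reduce_form(29, 34, 10))   # (1, 0, 1)
```

For [5, 12, 8] the steps are [5, 2, 1], then [1, −2, 5], then [1, 0, 4]; the discriminant 16 is the same at every stage.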

The discussion thus far has been valid for positive definite forms 
with arbitrary real coefficients. For the remainder of this section and 
the next, we consider only integral forms, that is, those with integral 

coefficients. 

Theorem 1-3. There are only finitely many classes of integral
definite forms of given discriminant $\Delta$.

Proof: To each class there belongs just one reduced form $[a, b, c]$
satisfying the conditions (13). Since

$$4a^2 \le 4ac = \Delta + b^2 \le \Delta + a^2,$$

the inequality $0 < a \le \sqrt{\Delta/3}$ holds for each reduced form, so that
there are only finitely many possible values of $a$ for fixed $\Delta$. Since
$|b| \le a$, the same is true of $b$, and for each pair $a$, $b$ there is at most
one integer $c$ such that $4ac - b^2 = \Delta$.

If, for example, $\Delta = 3$, then $0 < a \le 1$, so that $a = 1$ and hence
$b = 0$ or 1; from this it is easily seen that the only integral reduced
form of discriminant 3 is $x^2 + xy + y^2$. There is also just one class of
discriminant 4, and its reduced form is $x^2 + y^2$.
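The proof is constructive, and the search it describes can be carried out mechanically. The following sketch lists the integral positive reduced forms of a given discriminant by running over the finitely many admissible a and b; the name `reduced_forms` is an illustrative choice, and the discriminant 23 is included only as a further example.

```python
def reduced_forms(delta):
    """All integral positive reduced forms [a, b, c] with 4ac - b^2 = delta > 0."""
    forms = []
    a = 1
    while 3 * a * a <= delta:                 # 0 < a <= sqrt(delta / 3)
        for b in range(-a + 1, a + 1):        # -a < b <= a
            num = delta + b * b
            if num % (4 * a) == 0:
                c = num // (4 * a)
                if (-a < b <= a < c) or (0 <= b <= a == c):
                    forms.append((a, b, c))
        a += 1
    return forms

print(reduced_forms(3))    # [(1, 1, 1)]
print(reduced_forms(4))    # [(1, 0, 1)]
print(reduced_forms(23))   # [(1, 1, 6), (2, -1, 3), (2, 1, 3)]
```

The length of the returned list is the number of classes of the given discriminant, primitive or not.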

PROBLEMS 

1. Find all reduced integral definite forms of discriminant $\Delta < 20$.

2. Find the reduced form equivalent to [117, 103, 100]. 

1-6 Representations by definite forms. If a transformation of $\Gamma$
leaves a quadratic form unchanged, it is called an automorph of the
form. Since an automorph also leaves the representative of the form
unchanged, and is the only kind of transformation which does, the
following theorem is an easy consequence of Theorem 1-2.

Theorem 1-4. The only automorphs of $a(x^2 + y^2)$ are

$$\begin{cases} x = \pm x', \\ y = \pm y', \end{cases} \qquad\text{and}\qquad \begin{cases} x = \pm y', \\ y = \mp x'. \end{cases}$$

The only automorphs of $a(x^2 + xy + y^2)$ are

$$\begin{cases} x = \mp y', \\ y = \pm x' \pm y', \end{cases} \qquad \begin{cases} x = \pm x', \\ y = \pm y', \end{cases} \qquad\text{and}\qquad \begin{cases} x = \pm x' \pm y', \\ y = \mp x'. \end{cases}$$

Any positive reduced form distinct from these two has only the automorphs

$$\begin{cases} x = \pm x', \\ y = \pm y'. \end{cases}$$

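An integer $n$ is said to be properly representable by an integral form
$[a, b, c]$ of discriminant $\Delta$ if there are relatively prime integers $\alpha$, $\gamma$
such that $a\alpha^2 + b\alpha\gamma + c\gamma^2 = n$. For such $\alpha$, $\gamma$, there are $\beta_0$ and $\delta_0$
such that $\alpha\delta_0 - \beta_0\gamma = 1$, and, in fact,

$$\alpha\delta - \beta\gamma = 1$$

if, for some integer $t$,

$$\beta = \beta_0 + \alpha t, \qquad \delta = \delta_0 + \gamma t.$$

If we make the substitution

$$x = \alpha x' + \beta y', \qquad y = \gamma x' + \delta y', \tag{14}$$

then $[a, b, c]$ goes into a form $[n, m, l]$ with first coefficient $n$, by
equations (5). Also by (5),

$$\begin{aligned}
m &= 2a\alpha(\beta_0 + \alpha t) + b(\alpha\delta_0 + \alpha\gamma t + \beta_0\gamma + \alpha\gamma t) + 2c\gamma(\delta_0 + \gamma t) \\
  &= 2a\alpha\beta_0 + b(\alpha\delta_0 + \beta_0\gamma) + 2c\gamma\delta_0 + 2nt,
\end{aligned}$$

so that $m$ is determined modulo $2n$. Choose $m$ so that $0 \le m < 2n$;
then $t$ is fixed, $\beta$ and $\delta$ are unique, and $l$ is determined by the discriminant:

$$4ln - m^2 = \Delta.$$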

Theorem 1-5. If $\alpha$, $\gamma$ is a proper representation of $n$ by $[a, b, c]$,
then there are unique integers $\beta$, $\delta$ such that $\alpha\delta - \beta\gamma = 1$ and the substitution (14)
replaces $[a, b, c]$ by the equivalent form $[n, m, l]$, where $0 \le m < 2n$,
$m$ satisfies the congruence

$$m^2 \equiv -\Delta \pmod{4n}, \tag{15}$$

and

$$l = \frac{m^2 + \Delta}{4n}. \tag{16}$$

Thus to each proper representation of $n$ by $[a, b, c]$ there corresponds
a unique form which has first coefficient $n$ and satisfies certain auxiliary
conditions. The appropriate converse, which we now consider,
gives the number of such representations, and provides an effective
method of finding them. If $m$ is a solution of (15) and $0 \le m < 2n$,
then $4n - m$ is also a root, and $2n < 4n - m \le 4n$. We shall refer
to $m$ as a minimum root if $0 \le m < 2n$.

Theorem 1-6. Let $w(f)$ be the number of automorphs of $f = [a, b, c]$,
an integral positive form of discriminant $\Delta$. Let $n$ be a positive integer.
Corresponding to each minimum root $m$ of the congruence (15),
determine $l$ by equation (16). Then the number of proper representations
of $n$ by $f$ is $w(f)$ times the number of such forms $[n, m, l]$ which
are equivalent to $f$. In particular, if there is only one class of discriminant
$\Delta$, the number of proper representations is $w(f)$ times the
number of minimum roots of (15).

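Proof: Suppose that $g = [n, m, l]$ is a form of the type described in
the theorem. Then if $f$ is not equivalent to $g$, Theorem 1-5 shows
that there is no representation of $n$ by $f$ corresponding to the minimum
root $m$. If $f$ is equivalent to $g$, let $T$ be the matrix of a substitution
which replaces $f$ by $g$, and let $A$ be the matrix of an automorph of $f$.
Then

$$G = \overline{T}FT \quad\text{and}\quad F = \overline{A}FA,$$

so

$$\overline{(AT)}F(AT) = \overline{T}\,\overline{A}FAT = \overline{T}FT = G,$$

so that $AT$ is also the matrix of a substitution which carries $f$ into $g$.
Conversely, if for any $U$,

$$G = \overline{U}FU,$$

then $\overline{U}FU = \overline{T}FT$, and

$$F = \overline{T}^{-1}\overline{U}FUT^{-1},$$

so that $UT^{-1}$ is the matrix $A$ of an automorph of $f$, and $U = AT$. If

$$T = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

and $f$ has only two automorphs (see Theorem 1-4), then

$$AT = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} -\alpha & -\beta \\ -\gamma & -\delta \end{pmatrix},$$

and $\alpha$, $\gamma$ and $-\alpha$, $-\gamma$ give two distinct proper representations, since
$(\alpha, \gamma) = 1$ and therefore $\alpha$ and $\gamma$ are not both zero. If $f = a(x^2 + y^2)$,
then

$$AT = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} -\alpha & -\beta \\ -\gamma & -\delta \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} -\gamma & -\delta \\ \alpha & \beta \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} \gamma & \delta \\ -\alpha & -\beta \end{pmatrix},$$

and the representations $\alpha$, $\gamma$; $-\alpha$, $-\gamma$; $-\gamma$, $\alpha$; and $\gamma$, $-\alpha$ are again
distinct. If $f = a(x^2 + xy + y^2)$, then $AT$ is one of the matrices

$$\pm\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}, \qquad \pm\begin{pmatrix} -\gamma & -\delta \\ \alpha + \gamma & \beta + \delta \end{pmatrix}, \qquad\text{or}\qquad \pm\begin{pmatrix} \alpha + \gamma & \beta + \delta \\ -\alpha & -\beta \end{pmatrix},$$

and these also lead to distinct representations.

If there is only one class of discriminant $\Delta$, then $f$ and $g$ are necessarily
equivalent, so that all minimum roots of (15) lead to representations.
The proof is complete.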

In the case of primitive forms (those having relatively prime
coefficients), $w(f)$ depends only on $\Delta$: $w(f) = 6$, 4, or 2 according
as $\Delta$ is 3, 4, or larger than 4. If $f(x, y) = x^2 + y^2$, so that $\Delta = 4$,
then $m$ must be even to satisfy (15). Let $m = 2m_1$; then $m_1^2 \equiv
-1 \pmod n$, and $0 \le m < 2n$ means $0 \le m_1 < n$, so that the
number of proper representations of $n$ as a sum of two squares is four
times the number of solutions of the congruence $m_1^2 \equiv -1 \pmod n$.
This result was obtained in Theorem 7-5, Volume I, by quite different
methods.
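The statement about sums of two squares is easy to check by brute force for small n. The sketch below counts proper representations directly and compares the count with four times the number of solutions of the congruence; the function names `proper_representations` and `roots_of_minus_one` are illustrative choices made here.

```python
from math import gcd, isqrt

def proper_representations(n):
    """All pairs (x, y) of relatively prime integers with x^2 + y^2 = n."""
    r = isqrt(n)
    return [(x, y)
            for x in range(-r, r + 1)
            for y in range(-r, r + 1)
            if x * x + y * y == n and gcd(x, y) == 1]

def roots_of_minus_one(n):
    """Solutions m1 of m1^2 = -1 (mod n) with 0 <= m1 < n."""
    return [m for m in range(n) if (m * m + 1) % n == 0]

for n in (5, 25, 65):
    print(n, len(proper_representations(n)), 4 * len(roots_of_minus_one(n)))
# 5 8 8
# 25 8 8
# 65 16 16
```

For n = 25 the representations (0, ±5) and (±5, 0) are not proper and are therefore not counted, in agreement with the congruence having only the two solutions 7 and 18.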

PROBLEMS

1. Find the form $[n, m, l]$ corresponding to the proper representation
3, 5 of 118 by $[2, -5, 7]$.

2. What is the number of proper representations of 28 by $[1, 1, 2]$?
Find them.

3. Use Theorem 1-6 to discuss the proper representability of 10 by
$[2, 1, 2]$.

4. Show that every prime congruent to 1 or 3 (mod 8) has a unique
proper representation in the form $x^2 + 2y^2$ with $x > 0$, $y > 0$. More
generally, show that if $n$ is the product of powers of $r$ such primes, then $n$
has $2^{r-1}$ proper representations in this form.

1-7 Indefinite forms. The behavior of indefinite binary forms is 
remarkably different from that of forms with positive discriminant. 
For example, any integral indefinite form whose discriminant is not 
the negative of a square has infinitely many automorphs, and there- 
fore represents any integer in infinitely many ways if it represents it 
at all. Moreover, there seems to be no natural way to pick out a 
unique reduced form in each equivalence class, although we shall find 
a finite set of canonical forms in the case of integral forms. 

Hereafter we restrict attention to integral forms [a, b, c], and put

$$D = -\Delta = b^2 - 4ac > 0.$$

If $D$ is a square, then [a, b, c] factors into two linear factors with
integral coefficients. We dismiss this degenerate case, and hereafter
require that $D$ be a nonsquare integer. Finally, for the sake of
simplicity we consider only the case that [a, b, c] is primitive. We see
from equations (5) (proof by contradiction) that any form [A, B, C]
equivalent to a primitive form is again primitive.

As before, there is associated with [a, b, c] the quadratic equation

$$az^2 + bz + c = 0,$$

which this time has two real roots, say

$$\omega_1 = \frac{-b + \sqrt D}{2a}, \qquad \omega_2 = \frac{-b - \sqrt D}{2a}.$$

It is easily verified that a transformation of the modular group which
sends [a, b, c] into [a', b', c'] sends $\omega_1$ into $\omega_1'$ and $\omega_2$ into $\omega_2'$, and
never $\omega_1$ into $\omega_2'$. We call $\omega_1$ the first root, and $\omega_2$ the second root.

As C. Hermite noticed, there is also associated with

$$[a, b, c] = a(x - \omega_1 y)(x - \omega_2 y)$$

the family of definite forms

$$\varphi_t(x, y) = \frac{a}{2t}(x - \omega_1 y)^2 + \frac{at}{2}(x - \omega_2 y)^2,$$

where $t > 0$ is a real parameter. A simple calculation shows that the
discriminant of $\varphi_t(x, y)$ is $D$, for every $t > 0$. Reverting to the
quotient variable $z = x/y$, we find the zeros of $\varphi_t(z)$ to be those
points $z_t$ such that

$$\frac{1}{t}(z_t - \omega_1)^2 = -t(z_t - \omega_2)^2$$

or

$$z_t - \omega_1 = \pm it(z_t - \omega_2).$$

The transformation $z' = iz$ rotates the plane about the origin through
the angle $\pi/2$; it follows from the last equation that the line segment
connecting $z_t$ with $\omega_1$ is perpendicular to the segment connecting $z_t$
with $\omega_2$, and hence that $z_t$ lies on the circle having as diameter the
segment which connects $\omega_1$ and $\omega_2$. If, as usual, we take that root $z_t$
which has positive imaginary part as the representative of $\varphi_t$, then we
have associated with [a, b, c] the semicircle $\Sigma$ in $U$ connecting $\omega_1$
and $\omega_2$. As $t$ varies from $0+$ to $\infty$, $z_t$ describes $\Sigma$ from $\omega_1$ to $\omega_2$; we
can think of the semicircle as oriented with this sense, inasmuch as the
orientation is preserved under transformations of $\Gamma$. This orientation
is necessary, since otherwise there would be no way of distinguishing
the (usually inequivalent) forms [a, b, c] and [-a, -b, -c]. The
form is now completely described by specifying its oriented semicircle
$\Sigma$ and its discriminant $-D$.
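
As an informal numerical illustration (not part of the original text), the following Python sketch solves $z_t - \omega_1 = \pm it(z_t - \omega_2)$ for $z_t$, keeps the root in the upper half-plane, and checks that it lies on the circle with diameter from $\omega_1$ to $\omega_2$. The function name and the sample form [2, -4, -1] are choices made here for illustration only.

```python
import math

def hermite_point(a, b, c, t):
    """Return (z_t, w1, w2): the zero of phi_t with positive imaginary part
    and the two real roots of a z^2 + b z + c = 0 (D = b^2 - 4ac > 0)."""
    D = b * b - 4 * a * c
    w1 = (-b + math.sqrt(D)) / (2 * a)
    w2 = (-b - math.sqrt(D)) / (2 * a)
    # z - w1 = s*i*t*(z - w2)  =>  z = (w1 - s*i*t*w2) / (1 - s*i*t), s = +1 or -1
    candidates = [(w1 - s * 1j * t * w2) / (1 - s * 1j * t) for s in (1, -1)]
    z = max(candidates, key=lambda w: w.imag)
    return z, w1, w2

for t in (0.1, 1.0, 7.0):
    z, w1, w2 = hermite_point(2, -4, -1, t)
    center, radius = (w1 + w2) / 2, abs(w1 - w2) / 2
    assert z.imag > 0
    assert abs(abs(z - center) - radius) < 1e-9   # z_t lies on the semicircle
```

For very small t the point is close to $\omega_1$, and for very large t it is close to $\omega_2$, in agreement with the orientation described above.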

An indefinite form $f$ will be called reduced if the associated semicircle
intersects the fundamental region $R$ considered earlier. Thus
$f$ is reduced if and only if the definite form $\varphi_t$ is reduced for some $t$.
The fact that any indefinite form is equivalent to a reduced form is an
immediate consequence of the fact that $\varphi_1$, for example, is equivalent
to a reduced definite form: the transformation which carries $\varphi_1$ into
a reduced form also carries $f$ into a reduced form. The difficulty lies
in showing that each indefinite integral form is equivalent to only
finitely many reduced forms. To do this, we must first examine an
important subgroup of $\Gamma$ which is intimately connected with $f$.

1-8 The automorphs of indefinite forms. A transformation of $\Gamma$
which leaves [a, b, c] unchanged also leaves $\omega_1$ and $\omega_2$ fixed. The fixed
points of the transformation

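$$z' = \frac{\alpha z + \beta}{\gamma z + \delta}$$

are those points $\omega$ such that

$$\omega = \frac{\alpha\omega + \beta}{\gamma\omega + \delta},$$

or

$$\gamma\omega^2 + (\delta - \alpha)\omega - \beta = 0. \tag{17}$$

Suppose that the roots of this equation are $\omega_1$ and $\omega_2$. These numbers
are also the roots of the equation $a\omega^2 + b\omega + c = 0$; since
$(a, b, c) = 1$, it follows that for some integer $u$,

$$\gamma = au, \qquad \delta - \alpha = bu, \qquad \beta = -cu. \tag{18}$$

Putting $\delta + \alpha = t$, we have

$$\alpha = \frac{t - bu}{2}, \tag{19}$$

$$\delta = \frac{t + bu}{2}, \tag{20}$$

where $t$ and $u$ are such that

$$\frac{t^2 - b^2u^2}{4} + acu^2 = 1,$$

or

$$t^2 - Du^2 = 4. \tag{21}$$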

Conversely, if $t$ and $u$ are solutions of (21), and $\alpha$, $\beta$, $\gamma$, and $\delta$ are
determined by equations (18) through (20), then (17) reduces to
$u(a\omega^2 + b\omega + c) = 0$ and $\alpha\delta - \beta\gamma = 1$. This proves

Theorem 1-7. The set of all automorphs of the primitive indefinite
form [a, b, c] is given by the set of all matrices

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

with $\alpha$, $\beta$, $\gamma$, and $\delta$ determined by equations (18), (19), and (20),
where t and u run over the integral solutions of the Pell equation (21).

Originally, automorphs were defined as substitutions giving $z$ in
terms of $z'$, while we have here used the inverse transformation giving
$z'$ in terms of $z$. But if $F = A'FA$, then $F = A'^{-1}FA^{-1}$, so that the
inverse of an automorph is also an automorph, and the set of all
automorphs coincides with the set of all inverse automorphs. This
fact has much greater significance than in its application above. For
since the product of two automorphs is again an automorph, the
automorphs of $f$ form a subgroup of $\Gamma$, which we shall designate by
$\Gamma_A(f)$. (The elements of $\Gamma_A(f)$ will be taken sometimes as transformations
and sometimes as their matrices. The ambiguity resulting from
the fact that the matrices $A$ and $-A$ correspond to different substitutions
in the form but to the same fractional transformation of $\Gamma$
should cause no difficulty if the reader remains aware of it.) Using
well-known properties of the solutions of Pell's equation,* $\Gamma_A(f)$
can be characterized as follows.


Theorem 1-8. $\Gamma_A(f)$ is the infinite cyclic group generated by the
matrix

$$V = \begin{pmatrix} \tfrac12(t_0 - bu_0) & -cu_0 \\ au_0 & \tfrac12(t_0 + bu_0) \end{pmatrix};$$

if $A$ is any automorph of $f$, then $A = V^n$ for some integer $n$, positive,
negative or zero. Here $t_0$, $u_0$ is the minimal positive solution of
equation (21).

(The ambiguity mentioned above is exemplified here: every transformation
$z' = (\alpha z + \beta)/(\gamma z + \delta)$ of $\Gamma_A(f)$ can be made to have
matrix $V^n$, but the set of all substitutions which leave $f$ fixed is given
by the matrices $\pm V^n$.)

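Proof: According to Theorem 1-7, $\Gamma_A(f)$ is the group of matrices

$$\begin{pmatrix} \tfrac12(t - bu) & -cu \\ au & \tfrac12(t + bu) \end{pmatrix}, \qquad t^2 - Du^2 = 4,$$

so it is to be shown that each of these matrices is a power of $V$.
If we put

$$\left(\tfrac12(t_0 + u_0\sqrt D)\right)^{n+1} = \tfrac12(t_n + u_n\sqrt D)$$

for each $n$, then

$$\tfrac12(t_{n+1} + u_{n+1}\sqrt D) = \tfrac12(t_n + u_n\sqrt D)\cdot\tfrac12(t_0 + u_0\sqrt D) = \tfrac14(t_0t_n + Du_0u_n) + \tfrac14(t_0u_n + t_nu_0)\sqrt D,$$

so that

$$t_{n+1} = \tfrac12(t_0t_n + Du_0u_n), \qquad u_{n+1} = \tfrac12(t_nu_0 + t_0u_n).$$

Now suppose that

$$V^{n+1} = \begin{pmatrix} \tfrac12(t_n - bu_n) & -cu_n \\ au_n & \tfrac12(t_n + bu_n) \end{pmatrix},$$

an assumption which is correct for $n = 0$. Then

$$V^{n+2} = V\,V^{n+1} = \begin{pmatrix} \tfrac12(t_0 - bu_0) & -cu_0 \\ au_0 & \tfrac12(t_0 + bu_0) \end{pmatrix}\begin{pmatrix} \tfrac12(t_n - bu_n) & -cu_n \\ au_n & \tfrac12(t_n + bu_n) \end{pmatrix}$$

$$= \begin{pmatrix} \tfrac12(t_{n+1} - bu_{n+1}) & -\tfrac12 c(u_0t_n + u_nt_0) \\ \tfrac12 a(u_0t_n + u_nt_0) & \tfrac12(t_{n+1} + bu_{n+1}) \end{pmatrix} = \begin{pmatrix} \tfrac12(t_{n+1} - bu_{n+1}) & -cu_{n+1} \\ au_{n+1} & \tfrac12(t_{n+1} + bu_{n+1}) \end{pmatrix},$$

and by induction, $V^n$ is of the supposed form for all $n > 0$. Similarly,
it can be shown that

$$V^{-n} = \begin{pmatrix} \tfrac12(t_{n-1} + bu_{n-1}) & cu_{n-1} \\ -au_{n-1} & \tfrac12(t_{n-1} - bu_{n-1}) \end{pmatrix},$$

so that $V^n$ is also of the supposed form for all $n < 0$. Hence the
matrix corresponding to any solution of equation (21) is a power of $V$,
and the theorem is proved.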
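
Theorems 1-7 and 1-8 can be tried out on a small example. The Python sketch below is an illustration only, not the author's method: it finds the minimal positive solution of (21) by a naive search (adequate only for small D), builds V from equations (18) through (20), and checks that the first few powers of V leave the form unchanged and match the recurrence for $t_n$, $u_n$. All function names are hypothetical, and the form [2, -4, -1] is chosen because it reappears in Section 1-9.

```python
from math import isqrt

def minimal_pell4(D):
    """Least positive (t, u) with t^2 - D u^2 = 4, by brute-force search."""
    u = 1
    while True:
        t = isqrt(D * u * u + 4)
        if t * t == D * u * u + 4:
            return t, u
        u += 1

def automorph(form, t, u):
    """Matrix (alpha, beta, gamma, delta) given by equations (18)-(20)."""
    a, b, c = form
    return ((t - b * u) // 2, -c * u, a * u, (t + b * u) // 2)

def transform(form, m):
    """Coefficients of f(alpha x + beta y, gamma x + delta y)."""
    a, b, c = form
    al, be, ga, de = m
    return (
        a * al * al + b * al * ga + c * ga * ga,
        2 * a * al * be + b * (al * de + be * ga) + 2 * c * ga * de,
        a * be * be + b * be * de + c * de * de,
    )

def matmul(m, n):
    """Product of two 2x2 matrices stored row by row as 4-tuples."""
    a, b, c, d = m
    e, f, g, h = n
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

form = (2, -4, -1)                      # D = 24
D = form[1] ** 2 - 4 * form[0] * form[2]
t0, u0 = minimal_pell4(D)               # (10, 2)
V = automorph(form, t0, u0)             # (9, 2, 4, 1)

power, t, u = V, t0, u0
for _ in range(5):
    assert transform(form, power) == form        # each power is an automorph
    assert power == automorph(form, t, u)        # and has the form of Theorem 1-7
    power = matmul(power, V)
    t, u = (t0 * t + D * u0 * u) // 2, (t * u0 + t0 * u) // 2
```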

As usual, it is useful to know a fundamental region of $\Gamma_A(f)$.

Theorem 1-9. Suppose that the perpendicular bisector $C_0$ of the
segment $l$ joining $\omega_1$ and $\omega_2$ is mapped by $V$ into the circle $C_1$. Then
$C_1$ does not intersect $C_0$, and the (infinite) region between them,
together with $C_1$, $\omega_1$, and $\omega_2$, is a fundamental region of $\Gamma_A(f)$.

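Proof: If the arbitrary transformation

$$z' = T(z) = \frac{\alpha z + \beta}{\gamma z + \delta}$$

has the distinct fixed points $z_1$ and $z_2$, then by dividing $z' - z_1$ by
$z' - z_2$ we get

$$\frac{z' - z_1}{z' - z_2} = \frac{\alpha z + \beta - z_1(\gamma z + \delta)}{\alpha z + \beta - z_2(\gamma z + \delta)} = \frac{(\alpha - \gamma z_1)z + (\beta - \delta z_1)}{(\alpha - \gamma z_2)z + (\beta - \delta z_2)}$$

$$= \frac{\alpha - \gamma z_1}{\alpha - \gamma z_2}\cdot\frac{z + (\beta - \delta z_1)/(\alpha - \gamma z_1)}{z + (\beta - \delta z_2)/(\alpha - \gamma z_2)} = \frac{\alpha - \gamma z_1}{\alpha - \gamma z_2}\cdot\frac{z - T^{-1}(z_1)}{z - T^{-1}(z_2)} = \frac{\alpha - \gamma z_1}{\alpha - \gamma z_2}\cdot\frac{z - z_1}{z - z_2}.$$

In the case at hand, $T$ is the transformation

$$V: \quad z' = \frac{\tfrac12(t_0 - bu_0)z - cu_0}{au_0z + \tfrac12(t_0 + bu_0)}$$

with fixed points $\omega_1$ and $\omega_2$, and

$$\frac{\alpha - \gamma z_1}{\alpha - \gamma z_2} = \frac{\tfrac12(t_0 - bu_0) - au_0(-b + \sqrt D)/2a}{\tfrac12(t_0 - bu_0) - au_0(-b - \sqrt D)/2a} = \frac{t_0 - \sqrt D\,u_0}{t_0 + \sqrt D\,u_0}.$$

We put

$$K = \frac{t_0 - \sqrt D\,u_0}{t_0 + \sqrt D\,u_0}$$

and have for $V$ the representation

$$\frac{z' - \omega_1}{z' - \omega_2} = K\,\frac{z - \omega_1}{z - \omega_2}. \tag{22}$$

It follows that $V^n$ is the same transformation with $K$ replaced by $K^n$;
this could be used to give a second proof of Theorem 1-8.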
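
Relation (22) is easy to check numerically. The sketch below is an illustration only, using the sample form [2, -4, -1] with the solution $t_0 = 10$, $u_0 = 2$ of (21); the helper names `V` and `cross` are introduced here and do not occur in the text.

```python
import math

a, b, c = 2, -4, -1          # sample form, D = 24
t0, u0 = 10, 2               # minimal positive solution of t^2 - 24 u^2 = 4
D = b * b - 4 * a * c
root = math.sqrt(D)
w1 = (-b + root) / (2 * a)   # first root
w2 = (-b - root) / (2 * a)   # second root
K = (t0 - u0 * root) / (t0 + u0 * root)

def V(z):
    """The generating automorph as a fractional linear transformation."""
    alpha, beta = (t0 - b * u0) / 2, -c * u0
    gamma, delta = a * u0, (t0 + b * u0) / 2
    return (alpha * z + beta) / (gamma * z + delta)

def cross(z):
    """The quotient (z - w1)/(z - w2) appearing in (22)."""
    return (z - w1) / (z - w2)

assert 0 < K < 1
for z in (1j, 0.3 + 2j, -5 + 0.1j):
    assert abs(cross(V(z)) - K * cross(z)) < 1e-9
```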

By its definition, $K$ is a real number between 0 and 1. Since the
perpendicular bisector $C_0$ of the segment $l$ joining $\omega_1$ and $\omega_2$ has the
equation $|z - \omega_1| = |z - \omega_2|$, $V^n$ transforms it into $|z' - \omega_1| =
K^n|z' - \omega_2|$, as we see by taking absolute values in (22). If we put
$z' = x + iy$, the last equation becomes

$$C_n: \quad (x - \omega_1)^2 + y^2 = K^{2n}\bigl((x - \omega_2)^2 + y^2\bigr), \qquad n \ne 0,$$

and it is a matter of simple analytic geometry to prove the following
assertions: for positive $n$, $C_n$ is a circle with its center on the real
axis, on the extension through $\omega_1$ of $l$; it contains $\omega_1$ in its interior;
it lies entirely on that side of $C_0$ on which $\omega_1$ lies; and its radius
approaches zero as $n$ increases. For negative $n$, the circles $C_n$ lie on
the other side of $C_0$, contain $\omega_2$, and close down on $\omega_2$ as $|n|$ increases.
Some of these circles are shown in Fig. 1-3. The lightly shaded
region $R_A(1)$, which is the region described in Theorem 1-9, is the
set of points $z$ such that

$$K \le \left|\frac{z - \omega_1}{z - \omega_2}\right| < 1,$$

and it is clearly transformed by $V$ into the set $R_A(V)$ of points $z$
such that

$$K^2 \le \left|\frac{z - \omega_1}{z - \omega_2}\right| < K,$$

which is the region between $C_1$ and $C_2$, including $C_2$. In general,
$V^n$ transforms $R_A(1)$ into the region $R_A(V^n)$ between $C_n$ and $C_{n+1}$,
including $C_{n+1}$. Since the entire plane, excluding $\omega_1$ and $\omega_2$, is covered
in this fashion, and no point is in two such regions, any one of them,
together with $\omega_1$ and $\omega_2$, is a fundamental region of $\Gamma_A(f)$, and the
proof is complete.

Figure 1-3


We are concerned here only with the upper half-plane $U$; relative
to this, a fundamental region of $\Gamma_A(f)$ is that portion of any one of the
above regions which lies in $U$.

In the next section it will be convenient to have slightly more
freedom in choosing a fundamental region of $\Gamma_A(f)$. We get this by
noticing that, instead of beginning with the line $C_0$, we could have
started with any member of the family of circles

$$\left|\frac{z - \omega_1}{z - \omega_2}\right| = c. \tag{23}$$

For fixed $c > 0$, a fundamental region $R_A(c, 1)$ would then be the
ring between the circle (23), which we might call $C_0(c)$, and its transform

$$\left|\frac{z - \omega_1}{z - \omega_2}\right| = Kc;$$

the argument given above carries through with no change except for
the introduction of a factor $c$ in certain equations. Such a region is
shown heavily shaded in Fig. 1-3.


1-9 Reduction of indefinite forms. The semicircle $\Sigma$ representing
the form $f$ is the upper half of the circle given parametrically by

$$\left(\frac{z - \omega_1}{z - \omega_2}\right)^2 = -t^2, \qquad 0 < t < \infty.$$

The generating automorph $V$, given by equation (22), changes $\Sigma$
into the upper half of the circle

$$\left(\frac{z' - \omega_1}{z' - \omega_2}\right)^2 = -K^2t^2 = -(Kt)^2, \qquad 0 < t < \infty,$$

which is the same circle with a different parameter. In other words,
$\Sigma$ is transformed into itself by $V$, and hence by any element of $\Gamma_A(f)$,
in the sense that each point of $\Sigma$ goes into some other point of $\Sigma$,
although no points of $\Sigma$ remain fixed except $\omega_1$ and $\omega_2$. In fact, that
arc of $\Sigma$ which lies in a fundamental region $R_A(c, 1)$ is mapped by
$V^n$ onto the arc of $\Sigma$ which lies in the region $R_A(c, V^n)$, so that these
various arcs are equivalent with respect to $\Gamma_A(f)$. Hence they are
also equivalent with respect to the larger group $\Gamma$.

Now imagine $\Sigma$ drawn in Fig. 1-3. For suitable choice of $c$, the
circle $C_0(c)$ defined in the last section intersects $\Sigma$ at a point on the
boundary of one of the transforms of $R$, and this is then also true of
the equivalent point which is the intersection of $C_1(c)$ and $\Sigma$. The
arc between these two points is thus broken up by the boundaries of
the double triangles in Fig. 1-2 into a finite number, say $m$, of smaller
arcs. If these short arcs are transformed back into $R$ by suitable
operations of $\Gamma$, then every point of $\Sigma$ is equivalent to some point on
each of these new arcs; in other words, there are precisely $m$ elements
of $\Gamma$ which transform $\Sigma$ into a semicircle intersecting $R$. Hence
there are precisely $m$ reduced forms equivalent to $f$.

Theorem 1-10. There are only finitely many reduced forms in any
equivalence class of integral primitive indefinite forms.

Using the definition of reduced form, it is simple to characterize
reduced forms in terms of their coefficients. For clearly [a, b, c] is
reduced if and only if one or both of the points $\rho$ and $-\rho^2$ are inside
the semicircular region bounded by $\Sigma$, or if $\rho$ is on $\Sigma$. The points
below $\Sigma$ in $U$ are the points $z = x + iy$ such that

$$a\bigl(a(x^2 + y^2) + bx + c\bigr) < 0.$$

Since $\rho$ and $-\rho^2$ have the coordinates $x = -\tfrac12$, $y = \tfrac{\sqrt3}{2}$ and
$x = \tfrac12$, $y = \tfrac{\sqrt3}{2}$, respectively,
we have that $f$ is reduced if and only if either

$$a(2a \pm b + 2c) < 0 \quad\text{or}\quad 2a - b + 2c = 0. \tag{24}$$
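
Condition (24) translates directly into a test on the coefficients. The following Python sketch is only an illustration (the function name is hypothetical); it assumes that [a, b, c] is an integral indefinite form with nonsquare D, as in the text.

```python
def is_reduced(a, b, c):
    """Criterion (24): some sign makes a(2a +- b + 2c) negative,
    or the semicircle passes through rho, i.e. 2a - b + 2c = 0."""
    at_minus_rho_squared = a * (2 * a + b + 2 * c)   # sign at x = +1/2
    at_rho = a * (2 * a - b + 2 * c)                 # sign at x = -1/2
    return at_minus_rho_squared < 0 or at_rho < 0 or 2 * a - b + 2 * c == 0

# Forms of discriminant D = 24:
assert is_reduced(2, -4, -1)
assert is_reduced(-1, 0, 6)
assert not is_reduced(2, 8, 5)
```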

To find the set of reduced forms of the class containing a given form
[a, b, c], the procedure outlined for definite forms may first be used
to reduce

$$\varphi_1(x, y) = \frac a2(x - \omega_1y)^2 + \frac a2(x - \omega_2y)^2 = ax^2 + bxy + \frac{(b^2 + D)y^2}{4a};$$

the transformation which reduces $\varphi_1(x, y)$ also
reduces [a, b, c], say to $[a_1, b_1, c_1]$. Thus the semicircle $\Sigma_1$ representing
$[a_1, b_1, c_1]$ intersects the fundamental region $R$ of $\Gamma$, either in an
arc or in the single point $\rho$. Starting from a point on $\Sigma_1$ in $R$, move
along $\Sigma_1$ in the direction in which it is oriented. At the point at
which $\Sigma_1$ leaves $R$, it enters one of the regions

$$R(S^{-1}),\ R(S^{-1}W),\ R(WSW),\ R(WS),\ R(W),\ R(WS^{-1}),\ R(WS^{-1}W),\ \text{or}\ R(S),$$

since these are the only regions adjacent to $R$ (cf. Fig. 1-2). If it
enters $R(T_1)$, then $T_1^{-1}$ sends $\Sigma_1$ into a new semicircle $\Sigma_2$ (associated
with $[a_2, b_2, c_2]$) which has an arc in $R$, and this arc is the image
under $T_1^{-1}$ of the portion of $\Sigma_1$ in $R(T_1)$. The same argument can
now be applied to $\Sigma_2$, leading to a $\Sigma_3$ (associated with $[a_3, b_3, c_3]$)
which has an arc in $R$, and this arc is the image under $T_2^{-1}T_1^{-1}$ of
the arc of $\Sigma_1$ next encountered in moving along $\Sigma_1$ in the positive
direction. If the process is repeated p times, $\Sigma_1$ and $[a_1, b_1, c_1]$ will
recur.

It is rather the exceptional case that $\Sigma$ passes through $\rho$ or $-\rho^2$.
If it does not, the array of possible transformations listed above
simplifies: the only $T$'s to consider are then $S$, $S^{-1}$, and $W$. For
example, consider the reduced form [2, -4, -1], where $\omega_1 =
(2 + \sqrt6)/2$, $\omega_2 = (2 - \sqrt6)/2$. $\Sigma_1$ goes from $R$ to $R(W)$, so we
make the inversion $W^{-1} = W$, or $z = -1/z'$. This replaces [a, b, c]
by [c, -b, a], so here $[a_2, b_2, c_2] = [-1, 4, 2]$. $\Sigma_2$ goes from $R$ to
$R(S)$, so we make the translation $S^{-1}$, or $z = z' + 1$. In general
this replaces [a, b, c] by [a, 2a + b, a + b + c], so here $[a_3, b_3, c_3] =
[-1, 2, 5]$. $\Sigma_3$ also goes from $R$ to $R(S)$, and we get $[a_4, b_4, c_4] =
[-1, 0, 6]$. $\Sigma_4$ also goes from $R$ to $R(S)$, and $[a_5, b_5, c_5] = [-1, -2, 5]$.
A final application of $S^{-1}$ gives $[a_6, b_6, c_6] = [-1, -4, 2]$. Since $\Sigma_6$
goes from $R$ to $R(W)$, we invert, to get $[a_7, b_7, c_7] = [2, 4, -1]$. $\Sigma_7$
goes from $R$ to $R(S^{-1})$, so we must make the translation $S$: $z = z' - 1$.
In general, this replaces [a, b, c] by [a, b - 2a, a - b + c], so here
$[a_8, b_8, c_8] = [2, 0, -3]$. A second application of $S$ gives $[a_9, b_9, c_9] =
[2, -4, -1] = [a_1, b_1, c_1]$, and we have the complete set of reduced
forms for this class. If the algorithm were repeated indefinitely a
periodic sequence of forms would arise; it is therefore meaningful to
speak of the period of reduced forms.
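
The walk along the semicircle can be mechanized in the non-exceptional case. The Python sketch below is an illustration under stated assumptions and is not taken from the text: it assumes the semicircle never passes through $\rho$ or $-\rho^2$, and it decides between translation and inversion with a rule worked out here from the geometry (for a > 0 the semicircle is traversed from right to left, for a < 0 from left to right; it leaves R through a vertical side exactly when the corner on that side lies inside it). The function names are hypothetical. On the example above it reproduces the eight forms in the stated order.

```python
def corners_inside(a, b, c):
    """Return (left, right): whether rho (x = -1/2) and -rho^2 (x = +1/2)
    lie strictly inside the semicircle of [a, b, c]."""
    right = a * (2 * a + b + 2 * c)
    left = a * (2 * a - b + 2 * c)
    if left == 0 or right == 0:
        raise ValueError("semicircle passes through a corner of R (not handled)")
    return left < 0, right < 0

def next_form(form):
    """One step of the walk: a translation or the inversion W."""
    a, b, c = form
    left, right = corners_inside(a, b, c)
    if not (left or right):
        raise ValueError("form is not reduced")
    leftward = a > 0          # direction of travel along the semicircle
    translate = (left and right) or (left and leftward) or (right and not leftward)
    if not translate:
        return (c, -b, a)                      # inversion W
    if leftward:
        return (a, b - 2 * a, a - b + c)       # translation S
    return (a, b + 2 * a, a + b + c)           # translation S^{-1}

def period(form, max_steps=10000):
    """List the reduced forms met before the starting form recurs."""
    forms, current = [form], next_form(form)
    while current != form:
        if len(forms) >= max_steps:
            raise RuntimeError("no recurrence found within max_steps")
        forms.append(current)
        current = next_form(current)
    return forms

cycle = period((2, -4, -1))
```

Every form in `cycle` has the same value of $b^2 - 4ac$, since each step is a unimodular substitution.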

The following principle is useful in these calculations: If after a
translation the inequality (24) is correct for just one choice of sign,
the next step is an inversion, while if it holds for both signs, the next
step is a repetition of the translation. ($S$ is never followed by $S^{-1}$,
nor $W$ by $W$.) The reason for this should become clear upon looking
back at the derivation of (24).

Theorem 1-11. There are only finitely many classes of integral
indefinite forms of given discriminant.

Proof: First consider the primitive forms; for them it suffices to
show that there are only finitely many reduced forms of given
discriminant $\Delta = -D$. From (24) we get

$$2a^2 \pm ab \le -2ac,$$

so

$$4a^2 \pm 2ab + b^2 \le b^2 - 4ac = D.$$

But for each choice of sign, $4a^2 \pm 2ab + b^2$ is positive definite; it
therefore represents only positive integers unless $a = b = 0$, and by
Theorem 1-6, each of the integers 1, 2, . . . , $D$ is represented in only
finitely many ways. Hence there are only finitely many choices for
$a$ and $b$, and for each choice, $c$ is fixed by the requirement $b^2 = D + 4ac$.
There are therefore only finitely many reduced forms, and hence only
finitely many periods, and so only finitely many classes.
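
The finiteness argument also gives a practical search. The sketch below is an illustration only; the function names and the deliberately generous search bounds $|a| \le \sqrt D$, $|b| \le 2\sqrt D$ are choices made here (the proof gives the sharper bounds $3a^2 \le D$ and $3b^2 \le 4D$). It lists the primitive reduced forms of a given D using criterion (24).

```python
from math import gcd, isqrt

def is_reduced(a, b, c):
    """Criterion (24)."""
    return (a * (2 * a + b + 2 * c) < 0
            or a * (2 * a - b + 2 * c) < 0
            or 2 * a - b + 2 * c == 0)

def reduced_forms(D):
    """All primitive reduced forms [a, b, c] with b^2 - 4ac = D."""
    bound = isqrt(D)
    forms = []
    for a in range(-bound, bound + 1):
        if a == 0:
            continue
        for b in range(-2 * bound, 2 * bound + 1):
            numerator = b * b - D
            if numerator % (4 * a) != 0:
                continue                      # c would not be an integer
            c = numerator // (4 * a)
            if gcd(gcd(a, b), c) == 1 and is_reduced(a, b, c):
                forms.append((a, b, c))
    return forms

forms_24 = reduced_forms(24)
```

For D = 24 the list contains the eight forms of the period found above together with their negatives.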

If a class contains an imprimitive form, say with $(a, b, c) = d$, then
every form in that class also has divisor $d$, so that the class consists of
the elements of a class of primitive forms with smaller $D$, each
multiplied by $d$. There are only finitely many such classes.

PROBLEMS 

1. Find the period of reduced forms belonging to the class of 

+ Ixy -I- 

2. Show that Theorem 1-7 remains correct if the word "indefinite" is
omitted, that is, if $D < 0$ (cf. Theorem 1-4).

3. Show that there is just one class of primitive forms with $D = 20$, and
one class of imprimitive forms.

1-10 Representations. The discussion occurring between Theorems
1-4 and 1-6 made no use of the definiteness of the forms, and is
therefore equally applicable to indefinite forms. Thus Theorem 1-6
can be recast as follows.

Theorem 1-12. Let $f$ = [a, b, c] be a primitive integral indefinite
form of discriminant $\Delta$, where $D = -\Delta$ is not a square. Let $n$ be an
integer. Corresponding to each minimum root $m$ of the congruence
(15), determine $l$ by (16). If none of the forms [n, m, l] is equivalent
to $f$, there are no proper representations of $n$ by $f$. If at least one of the
new forms is equivalent to $f$, there are infinitely many proper
representations of $n$ by $f$; they are given by the first columns of all the
matrices $AT$, where $A$ can be any automorph $\pm V^n$ of $f$, and $T$ is any of a
set of matrices which replace $f$ by the various equivalent forms [n, m, l],
each form being obtained from just one $T$.
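
The way one representation produces infinitely many can be seen in a small case. In the Python sketch below (an illustration only, with hypothetical names), the form is $x^2 - 2y^2$ with D = 8, the matrix V = (3, 4; 2, 3) comes from the solution $t_0 = 6$, $u_0 = 2$ of (21), and the starting representation 3, 1 of 7 is chosen by hand. Multiplying the column (x, y) repeatedly by V corresponds to taking first columns of $V^kT$.

```python
from math import gcd

def value(form, x, y):
    """f(x, y) for f = [a, b, c]."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

def representations(form, V, start, count):
    """Apply the automorph V repeatedly to one representation (x, y)."""
    al, be, ga, de = V
    x, y = start
    out = []
    for _ in range(count):
        out.append((x, y))
        x, y = al * x + be * y, ga * x + de * y
    return out

form = (1, 0, -2)        # x^2 - 2y^2, D = 8
V = (3, 4, 2, 3)         # from t0 = 6, u0 = 2 in equations (18)-(20)
reps = representations(form, V, (3, 1), 5)

assert all(value(form, x, y) == 7 for x, y in reps)   # all represent 7
assert all(gcd(x, y) == 1 for x, y in reps)           # all are proper
assert len(set(reps)) == len(reps)                    # and distinct
```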

PROBLEMS 

1. Discuss the proper representation of 13 by [1, 3, -1].

2. Show that the odd numbers properly represented by $x^2 + 4xy - y^2$
are those of the form

k 

c 5‘ n Pi"', 

*=»i 

lem'i ““ ■ S •■ < - (ct. Prob. 
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2-1 Introduction. With a few exceptions, the theory developed 
up to this point, both in this volume and in the preceding intro- 
ductory volume, has been self-contained, in the sense that the prob- 
lems, which had to do with the ordinary integers, were solved 
without going outside this system. When considering the distri- 
bution of primes and the theory of quadratic forms, we made use 
of the real and complex numbers, but not in an intrinsically arith- 
metic fashion. In the investigation of the representability of an 
integer as a sum of squares,* however, we had occasion to consider 
the arithmetic structure of the set of Gaussian integers, and to apply 
this to a problem involving ordinary integers. During the last
century, it has been found that many problems in rational arithmetic
are treated most naturally by introducing larger sets of “integers”
and deducing, from the structure of the extended system, information
about the ordinary integers. Of course, as soon as a mathematician 
begins to work in a new medium, to use a metaphor from art, he 
finds interesting questions which have little or nothing to do with the 
original problem. In the present case, this tendency was instrumental 
in the development of modern abstract algebra, a large portion of 
which has only a tenuous connection with number theory. 

From the point of view of this text, general algebraic theory must 
take second place, the primary object being to give the reader an 
appreciation of the power afforded by the method, as well as a knowl- 
edge of some of the basic results in the subject. For this reason, the 
formulation will be kept as concrete as possible; there will be no 
striving for generality or abstractness for their own sakes. The 
treatment is self-contained, except for the following two theorems,
whose proofs can be found, for example, in L. E. Dickson, First
Course in the Theory of Equations (New York: John Wiley & Sons,
Inc., 1921), pp. 130-131 and 124-125, respectively.

*See, for example, Volume I, Chapter 7. 
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The product of two determinants of the same order is another 
determinant of that order, whose element in row i and column j is the 

sum of the products of the elements of the ith row of $D_1$ and the
corresponding elements of the jth row of $D_2$.


Symmetric Function Theorem. Any polynomial $P(x_1, \ldots, x_n)$
symmetric in $x_1, \ldots, x_n$ and of degree $g$ in each, is equal to a
polynomial of total degree $g$, with integral coefficients, in the elementary
symmetric functions

$$\sum x_1, \quad \sum x_1x_2, \quad \ldots, \quad x_1x_2\cdots x_n$$

and the coefficients of $P(x_1, \ldots, x_n)$. In particular, any symmetric
polynomial with integral coefficients is equal to a polynomial in the
elementary symmetric functions with integral coefficients.

If $P$ is a polynomial in the roots of an equation $f(x) = 0$ of degree $n$
and leading coefficient 1, and if $P$ is symmetric in $n - 1$ of the roots,
then $P$ is equal to a polynomial, with integral coefficients, in the
remaining root and the coefficients of $f(x)$ and $P$.

We shall also have occasion to use the so-called Fundamental
Theorem of algebra; this basic assertion is proved in the remainder
of the section.

Fundamental Theorem of Algebra. A polynomial $f(z) =
a_0z^n + \cdots + a_n$ having complex coefficients and positive degree, has a
complex zero. (It follows immediately that it has exactly $n$ complex
zeros, in the sense that there are complex numbers $\zeta_1, \ldots, \zeta_n$ such that

$$f(z) = a_0(z - \zeta_1)\cdots(z - \zeta_n).)$$

Proof: Since the truth of the theorem depends on the structure of
the complex numbers, it is necessary to use some properties of these
numbers. If the entire theory of functions of a complex variable is
assumed, the proof is very easy indeed: an analytic function has as
many zeros as poles, and a polynomial has a pole at infinity, so it
must have at least one zero. If less than this is assumed, it is reasonable
to ask that as little be assumed as possible. The proof to be
given uses the fact that a real-valued continuous function of two real
variables has a minimum value in any closed domain, and it assumes
familiarity with the symbol $\sqrt a$, where $a$ is real. (If DeMoivre's
theorem were used, to give meaning to $\sqrt[n]{a}$ for complex $a$, the proof
would be slightly simpler.)

With the second assumption, the quadratic formula provides a
proof when $n = 2$ and the coefficients are real. To solve a quadratic
equation with nonreal coefficients, it may be necessary to extract the
square root of a nonreal number. Let the number be $a + bi$. Then
the equation $a + bi = (x + iy)^2$ gives

$$a = x^2 - y^2 \quad\text{and}\quad b = 2xy,$$

or

$$4x^4 - 4ax^2 - b^2 = 0,$$

and we can take

$$x = \sqrt{\frac{a + \sqrt{a^2 + b^2}}{2}}, \qquad y = \frac{b}{2x}.$$
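
The square root just described uses only real square roots. A minimal Python sketch of the computation follows; it is an illustration only, the function name is hypothetical, and it assumes $b \ne 0$ (the number is nonreal), so that $x > 0$ and the division is legitimate.

```python
import math

def complex_sqrt(a, b):
    """Return real (x, y) with (x + iy)^2 = a + bi, assuming b != 0."""
    x = math.sqrt((a + math.sqrt(a * a + b * b)) / 2)
    y = b / (2 * x)
    return x, y

for a, b in ((3, 4), (-3, 4), (0, -2), (1.5, 0.25)):
    x, y = complex_sqrt(a, b)
    assert abs(complex(x, y) ** 2 - complex(a, b)) < 1e-12
```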

Before treating the general case, note first that we can write
$f(x + iy) = G(x, y) + iH(x, y)$, where $G$ and $H$ are polynomials in
the real variables $x$ and $y$, with real coefficients. It follows from the
continuity of $G$ and $H$ throughout the $xy$-plane that $|f(z)|$ is continuous
throughout the complex $z$-plane, where $z = x + iy$. Moreover,
for $n > 0$ and $a_0 \ne 0$ (which we henceforth assume), we have

$$\lim_{|z| \to \infty} |f(z)| = \infty.$$

For if $\max(|a_0|, \ldots, |a_n|) = A$, then

$$\begin{aligned}
|f(z)| &\ge |a_0z^n| - \bigl(|a_n| + |a_{n-1}z| + \cdots + |a_1z^{n-1}|\bigr) \\
&\ge |a_0z^n|\left(1 - \frac{nA}{|a_0z|}\right) && \text{for } |z| > 1 \\
&\ge \frac{|a_0z^n|}{2} && \text{for } |z| > \max\left(1, \frac{2nA}{|a_0|}\right).
\end{aligned}$$

Since $|f(z)|$ is continuous, it assumes a minimum value at some point
in any closed circular disk with center at 0, and since $|f(z)|$ becomes
infinite with $|z|$, the disk can be chosen so large that this minimum
occurs at an interior point $\zeta$. We must show that $|f(\zeta)| = 0$.

We now proceed by induction: suppose that every polynomial of
degree less than $n$, with complex coefficients, has a complex zero, and
that $f$ is of degree $n$ and $|f(z)|$ assumes its nonzero minimum at
$z = \zeta$. Suppose that $f(\zeta) = M$, and put

$$g(z) = \frac{f(z + \zeta)}{M} = 1 + b_1z + \cdots + b_nz^n;$$

then $|g(z)| \ge 1$ for all $z$. Define $k$ as the smallest index such that
$b_k \ne 0$, so that

$$g(z) = 1 + b_kz^k + \cdots + b_nz^n, \qquad k \le n.$$

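First consider the case that $k < n$. By the induction hypothesis,
the equation

$$1 + b_kz^k = 0$$

has a root. Let $\eta$ be this root, and put $z = \delta\eta$, where $0 < \delta < 1$.
Then

$$g(\delta\eta) = 1 + b_k\delta^k\eta^k + b_{k+1}\delta^{k+1}\eta^{k+1} + \cdots + b_n\delta^n\eta^n = 1 - \delta^k + b_{k+1}\delta^{k+1}\eta^{k+1} + \cdots + b_n\delta^n\eta^n.$$

Now if $|b_j| \le B$ for $k < j \le n$, then

$$|b_{k+1}\delta^{k+1}\eta^{k+1} + \cdots + b_n\delta^n\eta^n| \le B(1 + |\eta|)^n\delta^{k+1}(1 + \delta + \cdots + \delta^{n-k-1}) < Bn(1 + |\eta|)^n\delta^{k+1} = C\delta^{k+1}.$$

Thus

$$|g(\delta\eta)| < 1 - \delta^k + C\delta^{k+1} = 1 - \delta^k(1 - C\delta),$$

and for $0 < \delta < 1/C$, $|g(\delta\eta)| < 1$. This contradicts the assumption
that 1 is the minimum of $|g(z)|$; hence $M = 0$.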

If $k = n$, then $g(z) = 1 + b_nz^n$. If $n$ is even, then the equation

$$1 + i\sqrt{b_n}\,z^{n/2} = 0$$

is solvable, by the induction hypothesis, and any root of it is also a
root of $g(z) = 0$. Hence we can suppose that $n$ is odd. Put $b_n = c + di$.

If $c \ne 0$, we put $z = -\delta \operatorname{sgn} c$ (that is, $z = \delta$ or $-\delta$, according as
$c < 0$ or $c > 0$), and obtain

$$|1 + (c + di)z^n|^2 = |1 - |c|\delta^n - \delta^n di \operatorname{sgn} c|^2 = 1 - 2|c|\delta^n + (c^2 + d^2)\delta^{2n};$$

this last expression is again smaller than 1 for $\delta$ sufficiently small and
we have the same contradiction as before.

If $c = 0$, then $d \ne 0$; moreover, a sign can be chosen so that
$(\pm i)^n = i$. Then if $z = \pm i\delta \operatorname{sgn} d$, we have

$$|1 + di(\pm i\delta \operatorname{sgn} d)^n| = |1 - |d|\delta^n|,$$

and this is smaller than 1 for $\delta$ sufficiently small. The proof is
complete.

2-2 Polynomials and algebraic numbers. We begin by making 

the following definitions. 

(a) $R$ is the set of all rational numbers.

(b) $R[x]$ consists of $R$ together with all polynomials in $x$ with
rational coefficients, the coefficient of the highest power of $x$ being
different from zero.

(c) If a polynomial $p(x)$ is in $R[x]$, deg $p$ means the exponent
of the highest power of $x$ occurring in $p(x)$, if this is positive; if
$a \ne 0$ is in $R$, deg $a$ = 0, while if $a = 0$, deg $a$ is not defined.

(d) A polynomial $p(x)$ in $R[x]$ is said to be monic if the leading
coefficient is 1.

(e) If $p_1(x)$ and $p_2(x)$ are in $R[x]$, we say that $p_2(x)$ divides $p_1(x)$
(in symbols, $p_2(x) \mid p_1(x)$: the phrase does not divide is indicated by
the symbol “$\nmid$”) if there is a $q(x)$ in $R[x]$ such that $p_1(x) = p_2(x)q(x)$.
Under this definition, an element of $R$ different from zero divides
every element of $R[x]$. The nonzero elements of $R$ are therefore called
units of $R[x]$.

(f) An element $p(x)$ is said to be irreducible in $R[x]$ if it cannot be
written as the product of two nonunit elements of $R[x]$.

By formalizing the ordinary process of dividing one polynomial
by another, it is not hard to show that if $p_1(x)$ and $p_2(x)$ are in $R[x]$,
and $p_2(x)$ is not zero, then there exists a unique pair of elements $q(x)$
and $r(x)$ of $R[x]$ such that

$$p_1(x) = p_2(x)q(x) + r(x), \qquad \deg r < \deg p_2 \ \text{or}\ r(x) = 0.$$

This analog of the division theorem for integers* forms the basis for a
Euclidean algorithm, by means of which a greatest common divisor
$(p_1(x), p_2(x))$ can be determined; the development is entirely
parallel to that for the integers, and leads to the following theorems.

*See, for example, Volume I, Theorem 1-1.

Theorem 2-1. Given two elements $p_1(x)$, $p_2(x)$ of $R[x]$, not both
zero, there is another element $d(x)$ which is unique to within a unit
factor and which has the following properties:

(a) $d(x) \mid p_1(x)$ and $d(x) \mid p_2(x)$.

(b) If $d_1(x)$ is in $R[x]$, and divides both $p_1(x)$ and $p_2(x)$, then
$d_1(x) \mid d(x)$.

If $(p_1(x), p_2(x)) = d(x)$, there are elements $q_1(x)$ and $q_2(x)$ of
$R[x]$ such that

$$p_1(x)q_1(x) + p_2(x)q_2(x) = d(x).$$
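
The division theorem and the Euclidean algorithm for $R[x]$ can be carried out exactly with rational arithmetic. The following Python sketch is one possible arrangement, given only as an illustration; the representation of a polynomial as a list of coefficients with the constant term first, and all function names, are choices made here. It returns a monic greatest common divisor together with multipliers $q_1$, $q_2$ as in Theorem 2-1.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(p, d):
    """Return (q, r) with p = d*q + r and r = 0 or deg r < deg d (d nonzero)."""
    p, d = trim(p), trim(d)
    q = [Fraction(0)] * max(len(p) - len(d) + 1, 0)
    r = [Fraction(x) for x in p]
    while len(r) >= len(d):
        shift = len(r) - len(d)
        coef = r[-1] / d[-1]
        q[shift] = coef
        for i, x in enumerate(d):
            r[i + shift] -= coef * x      # cancels the leading term of r
        r = trim(r)
    return trim(q), r

def ext_gcd(p1, p2):
    """Return (d, q1, q2) with p1*q1 + p2*q2 = d and d the monic gcd."""
    r0, r1 = trim(p1), trim(p2)
    s0, s1 = [Fraction(1)], []
    t0, t1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        neg_q = [-x for x in q]
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, mul(neg_q, s1))
        t0, t1 = t1, add(t0, mul(neg_q, t1))
    scale = [Fraction(1) / r0[-1]]        # make the gcd monic
    return mul(r0, scale), mul(s0, scale), mul(t0, scale)

p1 = [-1, 0, 1]      # x^2 - 1
p2 = [2, -3, 1]      # x^2 - 3x + 2
d, q1, q2 = ext_gcd(p1, p2)
```

Here `d` is $x - 1$, the common factor of the two sample polynomials, and `q1`, `q2` are constants.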

Theorem 2-2. Any nonzero element of $R[x]$ can be factored into a
product of irreducible elements of $R[x]$, and this factorization is unique
except for the order of factors and the presence of units.

There is no loss in generality, and some gain in simplicity, in sup- 
posing that the various polynomials with which we deal are monic, 
since any polynomial can be made monic by multiplication by a unit. 
In this case the second part of Theorem 2-2 could be restated to read : 
The factorization of a monic polynomial into irreducible monic elements 
is unique except for the order of factors. 

We now consider the zeros of the polynomials of $R[x]$, or, what is
the same thing, the roots of equations $p(x) = 0$. If $\alpha$ is a root of the
equation

$$p(x) \equiv x^n + r_1x^{n-1} + \cdots + r_n = 0, \tag{1}$$

where $p(x)$ is in $R[x]$ and $n > 0$, then $\alpha$ is called an algebraic number;
if $p(x)$ is irreducible in $R[x]$, $\alpha$ is said to be of degree $n$. (The rational
numbers are algebraic numbers, since if $r$ is in $R$, $x - r = 0$ has the
root $x = r$. As algebraic numbers they are of degree 1, although when
considered as elements of $R[x]$ the nonzero rational numbers were
given degree 0.) An algebraic number $\alpha$ is a zero of a unique monic
irreducible polynomial in $R[x]$, called the defining polynomial of $\alpha$.
For if $p(x)$ is not irreducible, it can be factored uniquely into
irreducible monic factors, and $\alpha$ must be a zero of one of the factors.
Hence $\alpha$ satisfies some irreducible equation, i.e., an equation in which
the left side is irreducible in $R[x]$. If $\alpha$ satisfies two such equations,
say $p(x) = 0$ and $q(x) = 0$, then it also satisfies the equation $d(x) = 0$,
where $d(x) = (p(x), q(x))$. For if

$$p(x)s_1(x) + q(x)s_2(x) = d(x),$$

then

$$d(\alpha) = s_1(\alpha)\cdot 0 + s_2(\alpha)\cdot 0 = 0.$$

But since $p(x)$ and $q(x)$ are irreducible, their monic gcd is either 1 or
$p(x)$. Since $1 \ne 0$, $(p(x), q(x)) = p(x)$, and $p(x) = q(x)$.

If $p(x)$ in equation (1) is the defining polynomial of $\alpha$, its $n$ zeros
$\alpha_1 = \alpha, \alpha_2, \ldots, \alpha_n$ are called the conjugates of $\alpha$. Except for an
alternation in signs, the numbers $r_1, r_2, \ldots, r_n$ are simply the
elementary symmetric functions of $\alpha_1 = \alpha, \alpha_2, \ldots, \alpha_n$:

$$r_1 = -\sum\alpha_1 = -(\alpha + \alpha_2 + \cdots + \alpha_n),$$

$$r_2 = \sum\alpha_1\alpha_2 = \alpha\alpha_2 + \cdots + \alpha_{n-1}\alpha_n,$$

$$\cdots$$

$$r_n = (-1)^n\alpha\alpha_2\cdots\alpha_n.$$

As is the case here, we shall frequently use a Greek letter, both with
and without the subscript 1, to denote a single algebraic number.

Theorem 2-3. The sum, difference, and product of two algebraic
numbers are algebraic numbers. The quotient of two algebraic numbers
is an algebraic number if the denominator is not zero.

Proof: Suppose that $\alpha = \alpha_1$ and $\beta = \beta_1$ have defining polynomials

$$p(x) = x^n + r_1x^{n-1} + \cdots + r_n = (x - \alpha_1)(x - \alpha_2)\cdots(x - \alpha_n),$$
$$q(x) = x^m + s_1x^{m-1} + \cdots + s_m = (x - \beta_1)(x - \beta_2)\cdots(x - \beta_m),$$

respectively. Let $\gamma_1, \gamma_2, \ldots, \gamma_{nm}$ be the numbers obtained by
adding an $\alpha_i$ and a $\beta_j$ in all possible ways. Then the polynomial
$g(x) = (x - \gamma_1)(x - \gamma_2)\cdots(x - \gamma_{nm})$ has, as coefficients, symmetric
polynomials in the $\alpha_i$ and $\beta_j$, with integral coefficients. Let
one such coefficient be $t(\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_m)$. As a symmetric
polynomial in the $\alpha_i$ it is equal to a polynomial in $r_1, \ldots, r_n$, whose
coefficients are themselves polynomials in $\beta_1, \ldots, \beta_m$ with integral
coefficients. These last polynomials are symmetric in $\beta_1, \ldots, \beta_m$;
they are therefore integral combinations of $s_1, \ldots, s_m$, and consequently
are rational numbers. Thus the coefficients of $g(x)$ are
rational numbers, and $\alpha + \beta$ is an algebraic number. The same proof
applies for $\alpha\beta$ and $\alpha - \beta$, with obvious changes in the definition of
the $\gamma$'s.

If $\alpha$ is algebraic and different from zero, so is $1/\alpha$, for the zeros of
the polynomial

$$r_nx^n + r_{n-1}x^{n-1} + \cdots + r_1x + 1$$

are the reciprocals of those of

$$x^n + r_1x^{n-1} + \cdots + r_n,$$

and $r_n \ne 0$. Thus the assertion that $\alpha/\beta$ is algebraic is a consequence
of the fact that $1/\beta$ is algebraic.

The properties of the set of all algebraic numbers mentioned in Theorem 2-3 are shared by many sets of importance in mathematics; so many in fact that the name field has been reserved to describe such sets. Technically, a field $F$ is a set of two or more elements $a, b, \ldots$ together with an equivalence relation (which we designate by an equals sign) and two operations (which we designate by the symbols "+" and "·"), such that the following relations hold:

(a) For any $a$ and $b$ in $F$, either $a = b$ or $a \ne b$. If $a = b$, then $a + c = b + c$ and $a \cdot c = b \cdot c$, for every $c$ in $F$.

(b) The elements form a commutative group with respect to the operation "+", the identity element being designated by "0". In other words, if $a$, $b$, and $c$ are in $F$, then $a + b$ is in $F$, $a + b = b + a$, $a + (b + c) = (a + b) + c$, there is an element $-a$ in $F$ such that $(-a) + a = 0$, and $a + 0 = 0 + a = a$.

(c) The elements with 0 omitted (which we might call $F^*$) form a commutative group with respect to the operation "·", the identity element being designated by "1".

(d) Multiplication is distributive with respect to addition; that is, $a \cdot (b + c) = a \cdot b + a \cdot c$ for every $a$, $b$, and $c$ in $F$.

As long as one is working with a set of real or complex numbers, and ordinary multiplication, addition, and equality, one can show that the set forms a field just by showing that if $a$ and $b$ are in the set, so are $a \pm b$, $ab$, and $a/b$ if $b \ne 0$; the other requirements are automatically fulfilled. Thus Theorem 2-3 is just the assertion that the set of all algebraic numbers is a field. Other familiar examples of fields are the set of all rational numbers, the set of all real numbers, and the set of all complex numbers. (The integers, on the other hand, do not form a field, since only the elements $\pm 1$ have inverses, under multiplication, in the system.) In fact, every field composed of complex numbers, together with the ordinary operations of addition and multiplication, contains the field $R$ of rational numbers as a subfield. There are, however, fields with only finitely many elements. An example of such a field is the set of numbers $0, 1, \ldots, p - 1$, where $p$ is a prime, with the operations of addition and multiplication modulo $p$; in this case, $a + b$ is that element $c$ such that $a + b \equiv c \pmod p$; $a \cdot b$ is that element $d$ such that $a \cdot b \equiv d \pmod p$; $-a$ is 0 or $p - a$, according as $a$ is 0 or not 0; if $a \ne 0$, $a^{-1}$ is that element $f$ such that $a \cdot f \equiv 1 \pmod p$.
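The finite field just described can be tried out directly. The following is a minimal sketch, not part of the original text; the prime p = 7 and the function names are arbitrary choices made for illustration.

```python
# Arithmetic in the field {0, 1, ..., p-1} with operations modulo a prime p.
# p = 7 is an arbitrary illustrative choice.
p = 7

def add(a, b):
    return (a + b) % p

def mul(a, b):
    return (a * b) % p

def neg(a):
    # 0 if a is 0, otherwise p - a
    return (-a) % p

def inv(a):
    # the element f with a * f = 1 (mod p); requires Python 3.8+
    if a % p == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return pow(a, -1, p)

# Every nonzero element has an inverse, as requirement (c) demands.
inverses = {a: inv(a) for a in range(1, p)}
```

For a composite modulus the call to `pow` would fail for some nonzero residues, which is one way to see why the modulus must be prime.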

The field of all algebraic numbers will play no role in the present discussion. We consider instead certain subfields of it, called algebraic number fields, described in the next theorem.

Let $\vartheta$ be an algebraic number, of degree $n > 1$, whose defining polynomial is $p(x)$ as given in equation (1), and whose conjugates are $\vartheta_1 = \vartheta, \vartheta_2, \ldots, \vartheta_n$.

Theorem 2-4. The set of all numbers of the form

\[ \frac{q_1(\vartheta)}{q_2(\vartheta)}, \]

where $q_1(x)$ and $q_2(x)$ are in $R[x]$ and $q_2(\vartheta) \ne 0$, is a field, which will be denoted by $R(\vartheta)$. Every element of $R(\vartheta)$ can be expressed uniquely in the form

\[ \alpha = a_0 + a_1\vartheta + \cdots + a_{n-1}\vartheta^{n-1}, \]

where $a_0, a_1, \ldots, a_{n-1}$ are in $R$.

Proof: The first part is clear, since the sum, difference, product and quotient of rational functions are again rational functions.

Since $q_2(\vartheta) \ne 0$ and $p(x)$ is irreducible, $q_2(x)$ and $p(x)$ are relatively prime, and for some $t(x)$ and $s(x)$ in $R[x]$,

\[ t(x)p(x) + s(x)q_2(x) = 1. \]

This gives $s(\vartheta)q_2(\vartheta) = 1$, so that

\[ \alpha = \frac{q_1(\vartheta)}{q_2(\vartheta)} = s(\vartheta)q_1(\vartheta), \]

a polynomial in $\vartheta$. Since $p(\vartheta) = 0$,

\[ \vartheta^n = -r_1\vartheta^{n-1} - \cdots - r_n. \]

It follows that every positive power of $\vartheta$ can be written as a polynomial in $\vartheta$ of degree $n - 1$ or less. The same is therefore true of every element $\alpha$. If there were two different representations of $\alpha$ as polynomials in $\vartheta$ of degree $n - 1$ or less and with rational coefficients, their difference would be a nonzero polynomial of degree $n - 1$ or less which vanishes for $x = \vartheta$, which is impossible, since $p(x)$ is irreducible of degree $n$.
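The reduction step in this proof is easy to carry out mechanically. The sketch below is an illustration only: the function name `multiply` and the choice of example (a number whose cube is 2) are assumptions, not taken from the text. Elements are stored as coefficient lists of length n, and powers of degree n or more are eliminated with the relation derived from the defining polynomial.

```python
from fractions import Fraction

def multiply(u, v, r):
    """Multiply two elements of R(theta), each given as a coefficient list
    [a_0, ..., a_{n-1}] of length n. r = [r_1, ..., r_n] holds the
    coefficients of the monic defining polynomial
    x^n + r_1 x^(n-1) + ... + r_n of theta."""
    n = len(r)
    prod = [Fraction(0)] * (2 * n - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += Fraction(a) * Fraction(b)
    # Eliminate theta^k for k >= n, highest power first, using
    # theta^k = -(r_1 theta^(k-1) + ... + r_n theta^(k-n)).
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        prod[k] = Fraction(0)
        for i in range(1, n + 1):
            prod[k - i] -= c * Fraction(r[i - 1])
    return prod[:n]

# Example: theta^3 = 2, so the defining polynomial is x^3 - 2.
r = [0, 0, -2]
product = multiply([1, 1, 0], [1, -1, 1], r)   # (1 + theta)(1 - theta + theta^2)
```

Here the product is the rational number 3, so the quotient 1/(1 + theta) equals (1 - theta + theta^2)/3, a polynomial in theta of degree at most n - 1, as the theorem asserts.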

If $\alpha$ is an element of the field described in Theorem 2-4, and

\[ \alpha = a_0 + a_1\vartheta + \cdots + a_{n-1}\vartheta^{n-1} = \varphi(\vartheta), \]

then the numbers

\[ \alpha' = \alpha = \varphi(\vartheta), \quad \alpha'' = \varphi(\vartheta_2), \quad \ldots, \quad \alpha^{(n)} = \varphi(\vartheta_n) \]

are called the field conjugates of $\alpha$. (They may not lie in the field described in Theorem 2-4.) Every field conjugate of $\alpha$ is also a conjugate of $\alpha$ in the earlier sense, for if $\alpha$ has the defining equation $g(x) = 0$, then $g(\varphi(x))$ vanishes for $x = \vartheta$, so that $p(x) \mid g(\varphi(x))$ and $g(\varphi(\vartheta_i)) = 0$ for $i = 1, \ldots, n$. The converse is also true, as the following theorem shows.

Theorem 2-5. The set of field conjugates of an element $\alpha$ of $R(\vartheta)$ is either identical with the set of conjugates of $\alpha$, or consists of several copies of the set of conjugates of $\alpha$. (Hence $\deg \alpha \mid \deg \vartheta$.) The polynomial whose zeros are the field conjugates of $\alpha$ is a power of the defining polynomial of $\alpha$; if it is equal to the defining polynomial, then $R(\alpha) = R(\vartheta)$.

Proof: Form the field polynomial for $\alpha$:

\[ f(x) = (x - \alpha')(x - \alpha'')\cdots(x - \alpha^{(n)}). \]

Its coefficients are symmetric polynomials in the $\alpha$'s, and are therefore symmetric polynomials in $\vartheta_1, \ldots, \vartheta_n$, and so are rational numbers. Factor $f(x)$ into its monic irreducible factors in $R[x]$, say

\[ f(x) = f_1(x) \cdot f_2(x) \cdots, \]

and let $f_1(x)$ be a factor which vanishes for $x = \alpha$. Then $f_1(\varphi(\vartheta)) = 0$, so $p(x) \mid f_1(\varphi(x))$, and $f_1(x)$ vanishes at $\alpha', \alpha'', \ldots, \alpha^{(n)}$. If these are distinct, $f_1(x)$ is of degree $n$, and $f(x)$ is irreducible. If they are not, let $\alpha', \alpha'', \ldots, \alpha^{(t)}$ be a maximal distinct set of $\alpha$'s. Then $f_2(x)$ vanishes for some $\alpha^{(i)}$, so $f_1(x) \mid f_2(x)$; since $f_2(x)$ is irreducible, $f_2(x) = cf_1(x)$, and $c = 1$ since $f_1(x)$ and $f_2(x)$ are monic. If there are other factors of $f(x)$, the argument can be repeated. Eventually, we find that

\[ f(x) = \bigl(f_1(x)\bigr)^{n/t}. \]

Since the zeros of $f_1(x)$, which is the defining polynomial of $\alpha$, are the conjugates of $\alpha$, those of $f(x)$ (that is, the field conjugates) consist of $n/t$ copies of the set of conjugates of $\alpha$.

Now suppose that $f_1(x) = f(x)$. Define

\[ \psi(x) = f(x)\left[\frac{\vartheta_1}{x - \alpha'} + \frac{\vartheta_2}{x - \alpha''} + \cdots + \frac{\vartheta_n}{x - \alpha^{(n)}}\right], \]

so that $\psi(x)$ is a polynomial of degree $n - 1$ with rational coefficients. Since

\[ \psi(\alpha) = \vartheta(\alpha - \alpha'')\cdots(\alpha - \alpha^{(n)}) = \vartheta f'(\alpha), \]

we have that the number

\[ \vartheta = \frac{\psi(\alpha)}{f'(\alpha)} \]

is in $R(\alpha)$, so that $R(\vartheta)$ is a subfield of $R(\alpha)$, and $R(\alpha) = R(\vartheta)$.


The last assertion of the theorem shows that if one field $R(\alpha)$ is a proper subfield of a second field $R(\vartheta)$, then $\deg \alpha < \deg \vartheta$. For if $\deg \alpha = \deg \vartheta$, then the field polynomial of $\alpha$ with respect to $R(\vartheta)$ is irreducible, so that $f_1(x) = f(x)$, and $R(\alpha) = R(\vartheta)$.

The field $R(\vartheta)$ is called an algebraic number field; we say that $R(\vartheta)$ is obtained by adjoining $\vartheta$ to $R$, and call $R(\vartheta)$ a simple algebraic extension of $R$, of degree $n$. This same field can be obtained by adjoining various other numbers to $R$; for example, $R(2\vartheta) = R(\vartheta)$. If an element $\alpha$ of $R(\vartheta)$ is such that $R(\alpha) = R(\vartheta)$, then $\alpha$ is called a primitive element of $R(\vartheta)$. It is clear that the degrees of any two primitive elements are the same, and both are equal to the degree of the field.

There is, of course, no reason why the process of adjunction cannot be repeated; one can start from $R(\vartheta)$ and adjoin an algebraic number $\eta$ to it by taking all rational functions of $\eta$ whose coefficients are elements of $R(\vartheta)$. This new field is denoted by $R(\vartheta)(\eta)$, or more simply by $R(\vartheta, \eta)$.

Theorem 2-6. If $\vartheta$ and $\eta$ are algebraic numbers, the adjunction of $\eta$ to $R(\vartheta)$ gives the same field $R(\vartheta, \eta)$ as the adjunction of $\vartheta$ to $R(\eta)$. There exists an algebraic number $\zeta$ such that $R(\vartheta, \eta)$ is identical with $R(\zeta)$.

Proof: The first part is clear, since both $R(\vartheta)(\eta)$ and $R(\eta)(\vartheta)$ are identical with the field consisting of all numbers of the form

\[ \frac{q_1(\vartheta, \eta)}{q_2(\vartheta, \eta)}, \qquad q_2(\vartheta, \eta) \ne 0, \]

where $q_1(x, y)$ and $q_2(x, y)$ are polynomials in two variables with rational coefficients.

If $\eta$ is an element of $R(\vartheta)$, then $R(\vartheta, \eta) = R(\vartheta)$, since a rational function of a rational function is again a rational function. Assume then that $\vartheta$ and $\eta$ do not lie in the fields $R(\eta)$ and $R(\vartheta)$ respectively. Let their defining polynomials be $p_1(x)$ and $p_2(x)$, and let their conjugates be $\vartheta_1, \ldots, \vartheta_n$ and $\eta_1, \ldots, \eta_m$ respectively. Let $a$ and $b$ be rational numbers, and let $\zeta = \zeta_1, \ldots, \zeta_{nm}$ be all expressions of the form $a\vartheta_i + b\eta_j$. Since the conjugates of $\vartheta$ are distinct, as are the conjugates of $\eta$, there is only a finite set of ratios $a/b$ for which some two of the $\zeta_k$ are equal, and we choose $a$ and $b$ so that $a/b$ is not in this set. Furthermore, we order the $\zeta_k$ so that $\zeta_1 = \zeta = a\vartheta + b\eta$.

Now put

\[ f(x) = (x - \zeta_1)(x - \zeta_2)\cdots(x - \zeta_{nm}). \]

This polynomial has no multiple zeros, and its coefficients, being symmetric in the $\vartheta$'s and $\eta$'s, are rational. We show that $R(\vartheta, \eta) = R(\zeta)$. It is clear that every element of $R(\zeta)$ is in $R(\vartheta, \eta)$. Suppose on the other hand that $\rho$ is in $R(\vartheta, \eta)$, and that

\[ \rho = \frac{q_1(\vartheta, \eta)}{q_2(\vartheta, \eta)}, \qquad q_2(\vartheta, \eta) \ne 0. \]

Then we can define the numbers $\rho = \rho_1, \ldots, \rho_{nm}$ by the equation

\[ \rho_k = \frac{q_1(\vartheta_i, \eta_j)}{q_2(\vartheta_i, \eta_j)}, \]

where the same subscripts appear on $\vartheta$ and $\eta$ in the definition of $\rho_k$ as in the definition of $\zeta_k$, for $k = 1, 2, \ldots, nm$. Now put

\[ F(x) = f(x)\left(\frac{\rho_1}{x - \zeta_1} + \frac{\rho_2}{x - \zeta_2} + \cdots + \frac{\rho_{nm}}{x - \zeta_{nm}}\right); \]

by the Symmetric Function Theorem, the coefficients of $F(x)$ are rational. If $k > 1$, the polynomial

\[ \frac{f(x)}{x - \zeta_k} = (x - \zeta_1)\cdots(x - \zeta_{k-1})(x - \zeta_{k+1})\cdots(x - \zeta_{nm}) \]

vanishes for $x = \zeta$, and from the representation

\[ f(x)\,\frac{\rho}{x - \zeta} = \rho(x - \zeta_2)\cdots(x - \zeta_{nm}) \]

we have

\[ F(\zeta) = \rho(\zeta - \zeta_2)\cdots(\zeta - \zeta_{nm}). \]

Since

\[ f'(\zeta) = (\zeta - \zeta_2)\cdots(\zeta - \zeta_{nm}) \ne 0, \]

this gives

\[ \rho = \frac{F(\zeta)}{f'(\zeta)}, \]

and $\rho$ is in $R(\zeta)$.
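As a concrete check of Theorem 2-6, the following sketch works in the field obtained by adjoining the square roots of 2 and 5 (an example chosen here for illustration, with a = b = 1; it does not appear in the text). Elements are stored by their rational coordinates with respect to 1, √2, √5, √10, and the helper name `mul` is hypothetical.

```python
from fractions import Fraction

# An element c0 + c1*sqrt2 + c2*sqrt5 + c3*sqrt10 is stored as (c0, c1, c2, c3).
def mul(u, v):
    a0, a1, a2, a3 = u
    b0, b1, b2, b3 = v
    return (
        a0*b0 + 2*a1*b1 + 5*a2*b2 + 10*a3*b3,    # rational part
        a0*b1 + a1*b0 + 5*(a2*b3 + a3*b2),       # sqrt5*sqrt10 = 5*sqrt2
        a0*b2 + a2*b0 + 2*(a1*b3 + a3*b1),       # sqrt2*sqrt10 = 2*sqrt5
        a0*b3 + a3*b0 + a1*b2 + a2*b1,           # sqrt2*sqrt5 = sqrt10
    )

zeta = (0, 1, 1, 0)                      # zeta = sqrt2 + sqrt5
zeta2 = mul(zeta, zeta)                  # 7 + 2*sqrt10
zeta3 = mul(zeta2, zeta)                 # 17*sqrt2 + 11*sqrt5
zeta4 = mul(zeta2, zeta2)

# sqrt2 recovered as a polynomial in zeta: (zeta^3 - 11*zeta) / 6
sqrt2 = tuple(Fraction(x - 11*y, 6) for x, y in zip(zeta3, zeta))

# zeta satisfies x^4 - 14x^2 + 9 = 0
relation = tuple(x - 14*y + 9*z for x, y, z in zip(zeta4, zeta2, (1, 0, 0, 0)))
```

Since √2 is a polynomial in ζ with rational coefficients, so is √5 = ζ - √2, and the field generated by both square roots coincides with R(ζ).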



PROBLEMS 

1. Prove Eisenstein's irreducibility criterion: a polynomial $f(x) = a_0 + a_1x + \cdots + a_nx^n$ with integral coefficients cannot be written as a product of two or more polynomials with integral coefficients and positive degrees, if there is a prime $p$ such that

\[ p \nmid a_n, \quad p \mid a_i \text{ if } i < n, \quad\text{and}\quad p^2 \nmid a_0. \]

[Hint: Suppose that there is such a $p$, but that $f(x) = g(x)h(x)$, where $g(x) = b_0 + b_1x + \cdots + b_rx^r$, $h(x) = c_0 + c_1x + \cdots + c_sx^s$. It follows that $p$ divides exactly one of $b_0$ and $c_0$, say $b_0$. Let $b_k$ be the first coefficient in $g(x)$ not divisible by $p$, and deduce a contradiction from the expression for $a_k$ in terms of the $b$'s and $c$'s.] As we shall see later (Theorem 2-21), irreducibility over the set of polynomials with integral coefficients implies irreducibility over $R[x]$. Use this fact in Problem 2.

2. Show that the following polynomials are irreducible over $R[x]$:

(a) $x^n - p$, $p$ a prime.

(b) $x^{p-1} + x^{p-2} + \cdots + x + 1$. [Hint: Replace $x$ by $x + 1$.]

(c) x^ 4" 3x^ 4- 4. 

3. Show that \/3) is identical with Riy/2 4 - VS), and find a ra- 

tional function r(x) with rational coefficients such thatr('\/2 4- V^3) — V^. 
2-3 Algebraic integers. If the defining (monic) polynomial of an algebraic number $\alpha$ has integral coefficients, $\alpha$ is said to be an algebraic integer. This is a direct extension of the notion of ordinary or rational integers, which are the zeros of monic linear polynomials with integral coefficients. Hereafter we shall designate by $Z$ the set of all rational integers.

Theorem 2-7. The sum, difference, and product of two algebraic integers are again algebraic integers.

The proof follows the lines of the proof of Theorem 2-3.

Theorem 2-8. If $\alpha$ is a zero of a monic polynomial with coefficients in $Z$, then $\alpha$ is an algebraic integer.

= *'' + •■• + »»» the polynomial, 

of T,eA ts the defining polynomial 

. l^t 6o be the lcm of the denominators of the reduced fractions 

elativety prime rational integral coefficients. Then q (x) divides f(x), 
thCtCoefficients in the quotient polynomial being rational, and we can 

/(^) __ cg(x) 

q(^) 

whye c and c' are so chosen that ^ (x) has relatively prime coefficients 

a(r\n(-^^ 5 ~ and the coefficients of the product 

of /Cxi w prime. * Since this is also true of the coefficients 

cSiiit ? 7 f • r 

and honce h. . ±,, wMch 

Theorem 2-9. If $\alpha$ is a root of an equation

\[ f(x) = x^n + \beta_1x^{n-1} + \cdots + \beta_n = 0, \]

in which $\beta_1, \ldots, \beta_n$ are algebraic integers, then $\alpha$ is an algebraic integer.

field °R{^) ^ ^ simple extension 

of degree m, say. We can use the sets of field conjugates 

prec^dLg Tte'^oSm 3-T4, VdumeT^ 
to form polynomials

/ 2 (x) = x" + + • • • + 

U(x} = x” + 

The product /(x)/ 2 (x) • ■ has rational integral coefficients and 

is monic; by Theorem 2-8, a is an algebraic integer. 

The set of integers in a fixed algebraic number field $R(\vartheta)$ is also closed under addition, subtraction, and multiplication. We shall designate this set by $R[\vartheta]$ and call it the integral domain of the field. In particular, $R[1] = Z$ is the set of rational integers.

Theorem 2-10. If $\vartheta$ is an algebraic number, there exists some rational integer $a \ne 0$ such that $a\vartheta$ is an algebraic integer. If $\vartheta$ satisfies an equation $\beta_0\vartheta^n + \beta_1\vartheta^{n-1} + \cdots + \beta_n = 0$, in which $\beta_0, \ldots, \beta_n$ are algebraic integers, then $\beta_0\vartheta$ is an algebraic integer.

Proof: Let the defining equation of $\vartheta$ be

\[ p(x) = x^n + r_1x^{n-1} + \cdots + r_n = 0, \]

and let the lcm of the denominators of the fractions $r_1, \ldots, r_n$ be $a$. Then the polynomial

\[ a^n p\!\left(\frac{x}{a}\right) = x^n + ar_1x^{n-1} + a^2r_2x^{n-2} + \cdots + a^nr_n \]

has integral coefficients and is monic and irreducible; its zeros $a\vartheta, a\vartheta_2, \ldots, a\vartheta_n$ are therefore integers. The proof of the second part, using Theorem 2-9, is similar.
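The first part of this proof is constructive, and a short sketch may make the scaling explicit. The function name `integral_multiple` and the sample polynomial are illustrative assumptions only.

```python
from fractions import Fraction
from math import lcm   # available in Python 3.9+

def integral_multiple(r):
    """r = [r_1, ..., r_n] are the rational coefficients of the monic
    polynomial x^n + r_1 x^(n-1) + ... + r_n. Returns (a, s), where a is
    the lcm of the denominators and s = [a r_1, a^2 r_2, ..., a^n r_n]
    are the coefficients of the monic polynomial whose zeros are a times
    the zeros of the original one."""
    r = [Fraction(c) for c in r]
    a = lcm(*(c.denominator for c in r))
    s = [c * a ** (i + 1) for i, c in enumerate(r)]
    return a, s

# Example: a zero of x^2 + x/2 + 1/3, multiplied by a = 6,
# becomes a zero of x^2 + 3x + 12.
a, s = integral_multiple([Fraction(1, 2), Fraction(1, 3)])
```

Every entry of `s` has denominator 1, because the i-th coefficient is multiplied by at least one full factor of a.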


Since $R(\vartheta)$ and $R(a\vartheta)$ are identical for $a \ne 0$ in $Z$, any algebraic number field can be considered as the result of adjoining an algebraic integer to $R$.

If $\vartheta$ is an integer, so are its conjugates $\vartheta_2, \ldots, \vartheta_n$. The same is therefore true of its field conjugates.

If $\alpha$ is any element of the field $R(\vartheta)$ of degree $n$, the product $\alpha'\alpha''\cdots\alpha^{(n)}$ of all the field conjugates of $\alpha$ is called the norm of $\alpha$, and denoted by $N\alpha$ (a more complete notation would be $N_{R(\vartheta)}\alpha$).

Theorem 2-11. The norm of an algebraic integer is a rational integer.

Proof: If $\alpha$ has the defining equation

\[ x^m + s_1x^{m-1} + \cdots + s_m = 0, \]

then the norm of $\alpha$ (in any given $R(\vartheta)$ containing $\alpha$) is, apart from sign, a power of $s_m$, by Theorem 2-5.

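Theorem 2-12. If $\alpha$ and $\beta$ are elements of $R(\vartheta)$, then

\[ N(\alpha\beta) = N\alpha \cdot N\beta. \]

Proof: Put

\[
\begin{aligned}
\alpha &= a_0 + a_1\vartheta + \cdots + a_{n-1}\vartheta^{n-1},\\
\beta &= b_0 + b_1\vartheta + \cdots + b_{n-1}\vartheta^{n-1}.
\end{aligned}
\tag{3}
\]

Then in the product $\alpha\beta$, powers of $\vartheta$ higher than the $(n - 1)$th can be reduced using the equation

\[ \vartheta^n = -r_1\vartheta^{n-1} - \cdots - r_n \tag{4} \]

derived from the defining equation of $\vartheta$. Also $\alpha^{(k)}$ and $\beta^{(k)}$ can be obtained from (3) by replacing $\vartheta$ by $\vartheta_k$, and in the product $\alpha^{(k)}\beta^{(k)}$ higher powers of $\vartheta_k$ can be reduced by using (4) with $\vartheta$ replaced by $\vartheta_k$. Hence the field conjugates $(\alpha\beta)', (\alpha\beta)'', \ldots, (\alpha\beta)^{(n)}$ of $\alpha\beta$ are simply $\alpha'\beta', \alpha''\beta'', \ldots, \alpha^{(n)}\beta^{(n)}$. Thus

\[
\begin{aligned}
N(\alpha\beta) &= (\alpha\beta)'(\alpha\beta)''\cdots(\alpha\beta)^{(n)}\\
&= \alpha'\alpha''\cdots\alpha^{(n)} \cdot \beta'\beta''\cdots\beta^{(n)} = N\alpha \cdot N\beta.
\end{aligned}
\]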
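For a quadratic field the multiplicativity of the norm can be verified by direct computation. The sketch below is illustrative only: it assumes the field obtained by adjoining a square root of a square-free integer d, stores x + y√d as the pair (x, y), and uses the fact that the field conjugate of x + y√d is x - y√d, so that the norm is x² - dy². The numbers chosen are arbitrary.

```python
from fractions import Fraction

def mul(u, v, d):
    """Product of (a + b*sqrt(d)) and (c + e*sqrt(d)), as a pair."""
    (a, b), (c, e) = u, v
    return (a*c + d*b*e, a*e + b*c)

def norm(u, d):
    """Norm of a + b*sqrt(d): the product of its two field conjugates."""
    a, b = u
    return a*a - d*b*b

d = 7
alpha = (3, 2)                      # 3 + 2*sqrt(7)
beta = (Fraction(1, 2), -5)         # 1/2 - 5*sqrt(7)
lhs = norm(mul(alpha, beta, d), d)  # N(alpha * beta)
rhs = norm(alpha, d) * norm(beta, d)
```

Both sides come out to 13281/4 in this instance, in agreement with Theorem 2-12.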


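Let $\alpha, \beta, \ldots, \nu$ be $n$ elements of $R(\vartheta)$, with field conjugates $\alpha^{(k)}, \beta^{(k)}, \ldots, \nu^{(k)}$, where $k = 1, 2, \ldots, n$. The number

\[
\Delta(\alpha, \ldots, \nu) =
\begin{vmatrix}
\alpha' & \alpha'' & \cdots & \alpha^{(n)}\\
\beta' & \beta'' & \cdots & \beta^{(n)}\\
\vdots & \vdots & & \vdots\\
\nu' & \nu'' & \cdots & \nu^{(n)}
\end{vmatrix}^2
\]

is called the discriminant of the system $\alpha, \ldots, \nu$. Its value does not depend on the order of rows or of columns.

Theorem 2-13. If $\alpha, \beta, \ldots, \nu$ are in $R[\vartheta]$, then $\Delta(\alpha, \beta, \ldots, \nu)$ is a rational integer.

Proof: If we take the row-by-column product of the determinant with its transpose, we have

\[
\Delta(\alpha, \ldots, \nu) =
\begin{vmatrix}
\alpha'\alpha' + \cdots + \alpha^{(n)}\alpha^{(n)} & \cdots & \alpha'\nu' + \cdots + \alpha^{(n)}\nu^{(n)}\\
\vdots & & \vdots\\
\nu'\alpha' + \cdots + \nu^{(n)}\alpha^{(n)} & \cdots & \nu'\nu' + \cdots + \nu^{(n)}\nu^{(n)}
\end{vmatrix}.
\]

Just as in the proof of Theorem 2-12,

\[ \alpha'\beta' + \alpha''\beta'' + \cdots + \alpha^{(n)}\beta^{(n)} = (\alpha\beta)' + (\alpha\beta)'' + \cdots + (\alpha\beta)^{(n)}, \]

and the sum of the field conjugates of an integer is itself a rational integer, by analogy with the proof of Theorem 2-11. Hence, the number $\Delta(\alpha, \beta, \ldots, \nu)$ can be written as a determinant with rational integral entries, and so is a rational integer.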

The numbers $1, \vartheta, \ldots, \vartheta^{n-1}$ are said to form a basis of $R(\vartheta)$ in the sense that every element of $R(\vartheta)$ can be expressed in a unique way as a linear combination of these numbers, with coefficients in $R$ (cf. Theorem 2-4). We now examine the possibility of finding a basis for $R[\vartheta]$; that is, a set of elements of $R[\vartheta]$ such that every element of $R[\vartheta]$ can be expressed in a unique way as a linear combination of them, the coefficients in this case being in $Z$. To emphasize the distinction between these two kinds of bases, the second is sometimes called an integral basis. Every integral basis is a basis of $R(\vartheta)$, as is immediately seen from Theorem 2-10, but the converse is false.

If $\omega_1, \ldots, \omega_n$ is to be an integral basis, then for any $\rho$ in $R[\vartheta]$ the equation

\[ \rho = x_1\omega_1 + \cdots + x_n\omega_n, \]

and therefore also the equations

\[ \rho^{(k)} = x_1\omega_1^{(k)} + \cdots + x_n\omega_n^{(k)}, \qquad k = 2, \ldots, n, \]

must hold for some rational integers $x_1, \ldots, x_n$. If $\Delta(\omega_1, \ldots, \omega_n) \ne 0$, this system of equations can be solved, giving each $x_i$ as the quotient of determinants, the determinant in each denominator being a square root of $\Delta(\omega_1, \ldots, \omega_n)$. It seems plausible that the smaller $|\Delta(\omega_1, \ldots, \omega_n)|$, the better the chance of obtaining rational integral $x_i$. Hence, if an integral basis always exists, the next theorem ought to be true.

Theorem 2-14. If $\omega_1, \omega_2, \ldots, \omega_n$ are any $n$ integers of $R(\vartheta)$ for which $|\Delta(\omega_1, \omega_2, \ldots, \omega_n)|$ has its smallest possible value different from zero, then $\omega_1, \ldots, \omega_n$ form a basis of $R[\vartheta]$.


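Proof: Write

\[ \omega_i = \sum_{j=0}^{n-1} a_{ij}\vartheta^j, \qquad i = 1, 2, \ldots, n, \tag{5} \]

where the $a_{ij}$ are in $R$. Then

\[
\Delta(\omega_1, \ldots, \omega_n) =
\begin{vmatrix}
\omega_1 & \cdots & \omega_n\\
\vdots & & \vdots\\
\omega_1^{(n)} & \cdots & \omega_n^{(n)}
\end{vmatrix}^2
=
\begin{vmatrix}
\sum_{j=0}^{n-1} a_{1j}\vartheta^j & \cdots & \sum_{j=0}^{n-1} a_{nj}\vartheta^j\\
\vdots & & \vdots\\
\sum_{j=0}^{n-1} a_{1j}\vartheta_n^j & \cdots & \sum_{j=0}^{n-1} a_{nj}\vartheta_n^j
\end{vmatrix}^2,
\]

and this can be factored:

\[
\Delta(\omega_1, \ldots, \omega_n) =
\left(
\begin{vmatrix}
1 & \vartheta & \cdots & \vartheta^{n-1}\\
\vdots & \vdots & & \vdots\\
1 & \vartheta_n & \cdots & \vartheta_n^{n-1}
\end{vmatrix}
\cdot
\begin{vmatrix}
a_{10} & \cdots & a_{n0}\\
\vdots & & \vdots\\
a_{1,n-1} & \cdots & a_{n,n-1}
\end{vmatrix}
\right)^2
= (\det |a_{ij}|)^2\, \Delta(1, \vartheta, \ldots, \vartheta^{n-1}),
\tag{6}
\]

where $\vartheta, \vartheta_2, \ldots, \vartheta_n$ are the conjugates of $\vartheta$. Since $\Delta(\omega_1, \ldots, \omega_n) \ne 0$, also $\det |a_{ij}| \ne 0$, and the system of equations (5) can be solved for the numbers $1, \vartheta, \ldots, \vartheta^{n-1}$, giving linear expressions in $\omega_1, \ldots, \omega_n$. Thus every number $\rho$ of $R[\vartheta]$ can be written in the form

\[ \rho = b_1\omega_1 + \cdots + b_n\omega_n, \tag{7} \]

where $b_1, \ldots, b_n$ are rational. We must show that they are rational integers.

If this is not the case for the $\rho$ of (7), then some $b_i$ has a nonzero fractional part:

\[ b_i = [b_i] + c, \]

where $0 < c < 1$ and the symbol $[b]$ means the largest integer not exceeding $b$. Put

\[ \rho_1 = \rho - [b_i]\omega_i = b_1\omega_1 + \cdots + c\,\omega_i + \cdots + b_n\omega_n. \]

In just the same way that (6) was deduced from (5), we can deduce from the system of equations

\[
\begin{aligned}
\omega_1 &= \omega_1,\\
\omega_2 &= \omega_2,\\
&\;\;\vdots\\
\omega_{i-1} &= \omega_{i-1},\\
\rho_1 &= b_1\omega_1 + b_2\omega_2 + \cdots + c\,\omega_i + \cdots + b_n\omega_n,\\
\omega_{i+1} &= \omega_{i+1},\\
&\;\;\vdots\\
\omega_n &= \omega_n
\end{aligned}
\]

that

\[ \Delta(\omega_1, \ldots, \omega_{i-1}, \rho_1, \omega_{i+1}, \ldots, \omega_n) = c^2\Delta(\omega_1, \ldots, \omega_n). \]

But this implies that the discriminant of the system $\omega_1, \ldots, \rho_1, \ldots, \omega_n$ is numerically smaller than that of $\omega_1, \ldots, \omega_n$, and is not zero, which is contrary to the hypothesis that $|\Delta(\omega_1, \ldots, \omega_n)|$ is minimal.

Any two integral bases of a single field have the same discriminant, since each is the product of the other and the square of a determinant with integral entries, as in (6). The common value is called the discriminant of the field; we shall designate it by $\Delta$ hereafter.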


PROBLEMS 

1. Let I?, t?', and i?" be the roots of 

(a) a:® + 2 j: + 6 = 0, 

(b) X® — z* — X - 2 = 0. 

Compute the numbers — 2). 

Answer: (a) -206; (b) 4, 19. 
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2. (a) Let f{z) = aox" + * • - H~ On be irreducible over R, and let 
t?", . . . , be the zeros of /. Show that in 

t = i 

[This depends on the well-known factorization 

1 1 • • ■ 1 

Xi X2 • * * X„ 

. ! ! “ n {xi — Xj) 

l<j<i<n 

of a Vandermonde determinant.] 

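(b) If in particular $f(x) = x^3 + px + q$, show that $\Delta(1, \vartheta, \vartheta^2) = -27q^2 - 4p^3$.

3. Show that if $\alpha_1, \ldots, \alpha_n$ are elements of $R[\vartheta]$ such that $\Delta(\alpha_1, \ldots, \alpha_n)$ is square-free, then $\alpha_1, \ldots, \alpha_n$ form a basis for $R[\vartheta]$.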

2-4 Units and primes in $R[\vartheta]$. If $\alpha$ and $\beta$ are in an integral domain $R[\vartheta]$, we say that $\beta$ divides $\alpha$, and write $\beta \mid \alpha$, if there is another element $\gamma$ of $R[\vartheta]$ such that $\alpha = \beta\gamma$. An integer $\epsilon$ such that $\epsilon \mid 1$ is called a unit of $R[\vartheta]$. We say that $\alpha$ and $\beta$ are associates if $\alpha = \epsilon\beta$, where $\epsilon$ is a unit.

Theorem 2-15. An element of $R[\vartheta]$ is a unit if and only if its norm (as an element of $R(\vartheta)$) is $\pm 1$.

Proof: If $\epsilon$ is a unit and

\[ x^m + e_1x^{m-1} + \cdots + e_m = 0 \]

is its defining equation, then the defining equation of $1/\epsilon$ is

\[ x^m + \frac{e_{m-1}}{e_m}x^{m-1} + \cdots + \frac{1}{e_m} = 0. \]

Since $1/\epsilon$ is an integer, $e_m = \pm 1$, and $N(1/\epsilon)$ is, apart from sign, a power of the constant term in the defining equation of $1/\epsilon$; hence $N(1/\epsilon) = \pm 1$, and likewise $N\epsilon = \pm 1$. (Alternatively, this result could be deduced from the multiplicativity of the norm. For if $\epsilon$ is a unit, there exists an integer $\epsilon_1$ such that $\epsilon\epsilon_1 = 1$. Hence $1 = N1 = N\epsilon\epsilon_1 = N\epsilon \cdot N\epsilon_1$, and since the norm of an integer is a rational integer, $N\epsilon = \pm 1$.)

Conversely, if the constant term in the defining equation of an element of $R[\vartheta]$ is $\pm 1$, then the reciprocal of the element is also an element of $R[\vartheta]$, and the element is a unit.

The units of an integral domain form a multiplicative group, since the product of units is a unit, 1 is a unit, and each unit $\epsilon$ has an inverse $\epsilon_1$ such that $\epsilon\epsilon_1 = 1$.

In the domain of rational integers, the only units are $\pm 1$; in the Gaussian domain $R[i]$ the units are $\pm 1, \pm i$. All these units are roots of unity, but in some domains there are units which are not roots of unity, and in fact do not have absolute value 1. This was pointed out in Chapter 8 of Volume I, but we can now go into details.

Let $d$ be a square-free rational integer, and consider the field $R(\sqrt d)$. As a basis for the field we can take $1, \sqrt d$, so that every element of $R(\sqrt d)$ can be uniquely expressed in the form $a + b\sqrt d$, where $a$ and $b$ are in $R$. If $b = 0$, then $a + b\sqrt d$ is an integer if and only if $a$ is in $Z$. If $b \ne 0$, the defining equation of $a + b\sqrt d$ is

\[ (x - a - b\sqrt d)(x - a + b\sqrt d) = x^2 - 2ax + a^2 - db^2 = 0, \]

so that if $a + b\sqrt d$ is in $R[\sqrt d]$, both $2a$ and $a^2 - db^2$ must be rational integers. Hence $(2a)^2 - 4(a^2 - db^2) = 4db^2$ is also in $Z$; since $d$ is square-free, it follows that $2b$ is in $Z$.

Suppose that $a = k + \tfrac12$, with $k$ in $Z$. Then

\[ 0 \equiv 4a^2 - 4db^2 = 4k^2 + 4k + 1 - 4db^2 \equiv 1 - 4db^2 \pmod 4, \]

and it follows that $2b \equiv 1 \pmod 2$, and $d \equiv 1 \pmod 4$. Conversely, if $a$ and $b$ are halves of odd integers and $d \equiv 1 \pmod 4$, the defining equation of $a + b\sqrt d$ has coefficients in $Z$. Hence 1 and $(1 + \sqrt d)/2$ form a basis of $R[\sqrt d]$, if $d \equiv 1 \pmod 4$.

If $d \equiv 2$ or $3 \pmod 4$, then $a$ must be a rational integer. If $b$ were of the form $k + \tfrac12$, with $k$ in $Z$, we should have

\[ 0 \equiv 4a^2 - 4db^2 = 4a^2 - (4k^2 + 4k + 1)d \equiv -d \pmod 4, \]

and $d$ would not be square-free. Hence in this case both $a$ and $b$ must be in $Z$, and $1, \sqrt d$ form a basis of $R[\sqrt d]$.
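The criterion used in this discussion (both 2a and a² - db² must be rational integers) is simple to test numerically. The following sketch is an illustration only; the function name `is_integer_of_field` is a hypothetical choice.

```python
from fractions import Fraction

def is_integer_of_field(a, b, d):
    """Return True if a + b*sqrt(d) is an algebraic integer, where a and b
    are rational and d is a square-free rational integer."""
    a, b = Fraction(a), Fraction(b)
    if b == 0:
        return a.denominator == 1
    # Coefficients of the defining equation x^2 - 2a x + (a^2 - d b^2) = 0.
    trace = 2 * a
    norm = a * a - d * b * b
    return trace.denominator == 1 and norm.denominator == 1

half = Fraction(1, 2)
results = {
    5: is_integer_of_field(half, half, 5),    # d = 1 (mod 4)
    3: is_integer_of_field(half, half, 3),    # d = 3 (mod 4)
    -3: is_integer_of_field(half, half, -3),  # d = 1 (mod 4)
}
```

The number (1 + √d)/2 passes the test for d = 5 and d = -3 but fails for d = 3, matching the case distinction modulo 4.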

Theorem 2-16. Let $d$ be a square-free rational integer. Then if $d \equiv 1 \pmod 4$, the elements of $R[\sqrt d]$ are either of the form

\[ a + b\sqrt d, \qquad a \text{ and } b \text{ in } Z, \tag{8} \]

or of the form $a + b\sqrt d$ with $a$ and $b$ both halves of odd integers, and the discriminant of $R(\sqrt d)$ is

\[
\begin{vmatrix}
1 & \tfrac12(1 + \sqrt d)\\
1 & \tfrac12(1 - \sqrt d)
\end{vmatrix}^2
= (-\sqrt d)^2 = d.
\]

If $d \equiv 2$ or $3 \pmod 4$, all the elements of $R[\sqrt d]$ are of the form (8), and the discriminant of $R(\sqrt d)$ is

\[
\begin{vmatrix}
1 & \sqrt d\\
1 & -\sqrt d
\end{vmatrix}^2
= (-2\sqrt d)^2 = 4d.
\]

The units of $R[\sqrt d]$ are the integers $\epsilon$ for which $N\epsilon = \pm 1$. If $d \equiv 2$ or $3 \pmod 4$, then $\epsilon$ is of the form (8), so that the units are given by the solutions of the Pell equations

\[ x^2 - dy^2 = \pm 1. \tag{9} \]

If $d \equiv 1 \pmod 4$, the units are the integers of the form $(x + y\sqrt d)/2$, where $x, y$ is a solution of one of the Pell equations

\[ x^2 - dy^2 = \pm 4. \tag{10} \]

If $d < 0$, these Pell equations have only trivial solutions: (9) has solutions $\pm 1, 0$ in all cases, and $0, \pm 1$ if $d = -1$, while (10) has the solution $\pm 2, 0$ always, and $\pm 1, \pm 1$ if $d = -3$. If $d > 0$, equations (9) and (10) have infinitely many solutions.*
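For small positive d a nontrivial solution of (9) can be found by direct search. The sketch below is a naive illustration under that assumption (the function name `smallest_unit` and the search bound are arbitrary choices); it is not the continued-fraction method of Volume I.

```python
from math import isqrt

def smallest_unit(d, bound=10**6):
    """Search for the least y >= 1 such that x^2 - d*y^2 = -1 or +1 has a
    solution x > 0, for a positive non-square d. Returns (x, y, sign),
    or None if nothing is found below the bound."""
    for y in range(1, bound):
        for sign in (-1, 1):
            x2 = d * y * y + sign
            x = isqrt(x2)
            if x * x == x2:
                return x, y, sign
    return None

units = {d: smallest_unit(d) for d in (2, 3, 7)}
```

For d = 2 the search returns x = 1, y = 1 with sign -1, so 1 + √2 is a unit of norm -1 whose absolute value is not 1; its powers give infinitely many further units.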

Returning to the general domain $R[\vartheta]$, we say that an element $\pi$ is prime if it is not a unit and has no factors other than its associates and units.

Theorem 2-17. Every nonunit element of $R[\vartheta]$ can be written as a finite product of primes.

Proof: If $\alpha$ in $R[\vartheta]$ is not a unit, $|N\alpha| > 1$. If $\alpha$ is prime, we have the trivial representation $\alpha = \alpha$. If not, there is a factorization $\alpha = \beta\gamma$ into nonunits, and $N\alpha = N\beta \cdot N\gamma$, where

\[ 1 < |N\beta| < |N\alpha|, \qquad 1 < |N\gamma| < |N\alpha|. \]

If either $\beta$ or $\gamma$ is not prime, it may be factored. The process must terminate, since the rational integer $N\alpha$ has only finitely many divisors of absolute value greater than 1.

* This result is given in Chapter 8, Volume I. The solutions for given $d$ can be found explicitly with the aid of Theorem 9-6 of that volume.

To see that this factorization need not be unique, consider the two representations

\[ 21 = 3 \cdot 7 = (4 + \sqrt{-5})(4 - \sqrt{-5}) \]

of 21 in $R[\sqrt{-5}]$. Since $-5 \not\equiv 1 \pmod 4$, the integers of this domain are $a + b\sqrt{-5}$, with $a$ and $b$ in $Z$, and the units are $\pm 1$. It is clear that no two of the numbers $3$, $7$, $4 + \sqrt{-5}$, $4 - \sqrt{-5}$ are associates, and we can also show that all of them are primes in $R[\sqrt{-5}]$. Suppose that

\[ (a_1 + b_1\sqrt{-5})(a_2 + b_2\sqrt{-5}) = 3. \]

Then

\[ N(a_1 + b_1\sqrt{-5}) \cdot N(a_2 + b_2\sqrt{-5}) = N3 = 9, \]

so that if neither factor is a unit, it must be that

\[ N(a_1 + b_1\sqrt{-5}) = a_1^2 + 5b_1^2 = 3. \tag{11} \]

This equation, however, has no solution in $Z$. By a similar argument, 7 has no proper divisors, since the equation

\[ a_1^2 + 5b_1^2 = 7 \tag{12} \]

has no solution in $Z$. Finally, an assumed factorization of either $4 \pm \sqrt{-5}$ leads to the equation

\[ N(a_1 + b_1\sqrt{-5}) \cdot N(a_2 + b_2\sqrt{-5}) = 21, \]

which in turn requires that either (11) or (12) hold. Hence $R[\sqrt{-5}]$ is not a unique factorization domain.
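The arithmetic behind this example can be confirmed by a short computation. The sketch below is illustrative only; it stores a + b√-5 as the integer pair (a, b), and the helper names are hypothetical.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-5)."""
    return a * a + 5 * b * b

def mul(u, v):
    """Product of two elements a + b*sqrt(-5), given as integer pairs."""
    (a, b), (c, e) = u, v
    return (a * c - 5 * b * e, a * e + b * c)

def representable(m):
    """Is there an integer pair (a, b) with a^2 + 5 b^2 = m?
    Any solution has |a| <= m and |b| <= m, so a finite box suffices."""
    return any(norm(a, b) == m
               for a in range(-m, m + 1)
               for b in range(-m, m + 1))

product = mul((4, 1), (4, -1))           # (4 + sqrt(-5)) (4 - sqrt(-5))
no_norm_3 = not representable(3)         # equation (11) has no solution
no_norm_7 = not representable(7)         # equation (12) has no solution
```

Since no element has norm 3 or 7, none of the four factors of 21 can be split into two nonunits, which is the content of the argument above.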

A domain $R[\vartheta]$ is called a Euclidean domain if for any pair of integers $\beta \ne 0$ and $\alpha$ of $R[\vartheta]$ there is an element $\gamma$ such that

\[ |N(\alpha - \beta\gamma)| < |N\beta|. \]

In this case, there is a Euclidean algorithm by means of which a greatest common divisor can be defined, such that if $(\alpha, \beta) = \delta$, there are integers $\gamma_1$ and $\gamma_2$ in $R[\vartheta]$ for which $\alpha\gamma_1 + \beta\gamma_2 = \delta$. It is this last property which is essential for unique factorization, since from it we get the result, equivalent to the Unique Factorization Theorem, that if $\beta \mid \alpha\gamma$ and $(\beta, \alpha) = 1$ then $\beta \mid \gamma$. For if $\gamma_1\alpha + \gamma_2\beta = 1$, then $\gamma_1\alpha\gamma + \gamma_2\beta\gamma = \gamma$; hence $\beta \mid \gamma$. There is no such gcd in $R[\sqrt{-5}]$. For example, 3 and $4 + \sqrt{-5}$ must be considered as relatively prime, since they are nonassociated primes, but if we had

\[ 3(a + b\sqrt{-5}) + (4 + \sqrt{-5})(c + d\sqrt{-5}) = 1, \qquad a, b, c, d \text{ in } Z, \]

it would follow that

\[ 3a + 4c - 5d = 1, \qquad 3b + c + 4d = 0. \]

Subtracting the second equation from the first, we would have

\[ 3(a - b + c - 3d) = 1, \]

which is palpably false.

Every Euclidean domain, then, is a unique factorization domain, although the converse is not true. The quadratic Euclidean domains are completely known: $R[\sqrt d]$ is Euclidean if and only if $d$ has one of the 21 values $-11$, $-7$, $-3$, $-2$, $-1$, 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, or 73.
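For the case d = -1 (the Gaussian domain) the Euclidean algorithm can be written out in a few lines. This is a sketch under the assumption that the quotient is chosen by rounding each coordinate of α/β to the nearest integer, which makes the norm of the remainder at most half the norm of β; the function names are hypothetical.

```python
def norm(z):
    a, b = z
    return a * a + b * b

def nearest(n, m):
    """Nearest integer to n/m for a positive integer m."""
    return (2 * n + m) // (2 * m)

def divmod_gauss(alpha, beta):
    """Return (gamma, remainder) with alpha = beta*gamma + remainder and
    norm(remainder) < norm(beta). Numbers a + bi are pairs (a, b)."""
    (a, b), (c, d) = alpha, beta
    n = norm(beta)
    # alpha / beta = alpha * conj(beta) / N(beta)
    q = (nearest(a * c + b * d, n), nearest(b * c - a * d, n))
    r = (a - (q[0] * c - q[1] * d), b - (q[0] * d + q[1] * c))
    return q, r

def gcd_gauss(alpha, beta):
    while beta != (0, 0):
        _, r = divmod_gauss(alpha, beta)
        alpha, beta = beta, r
    return alpha

g = gcd_gauss((11, 3), (1, 8))     # gcd of 11 + 3i and 1 + 8i
```

The result is determined only up to multiplication by one of the units ±1, ±i; with the rounding convention used here the algorithm returns -1 + 2i, of norm 5.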


PROBLEMS 

1. Show that $R[\rho]$, where $\rho = (-1 + i\sqrt 3)/2$ is a cube root of unity, is a Euclidean domain. [Compare Theorem 7-6, Volume I.]

2. Find the gcd of $2 + \rho$ and $5 + 7\rho$ in $R[\rho]$.

3. Show that if $d$ is square-free, and if $\Delta$ is the discriminant of $R(\sqrt d)$, then the numbers 1 and $(\Delta + \sqrt\Delta)/2$ form a basis of $R[\sqrt d]$.

2-5 Ideals. One way of restoring unique factorization consists in enlarging the set of possible divisors; we might for example try to find entities $A$, $B$, $C$, and $D$ of $R[\sqrt{-5}]$ which are in some sense prime, and such that

\[ 3 = AB, \quad 7 = CD, \quad 4 + \sqrt{-5} = AC, \quad 4 - \sqrt{-5} = BD. \]

Then the two representations of 21 in $R[\sqrt{-5}]$ would no longer differ essentially; instead we would have

\[ 21 = (AB)(CD) = (AC)(BD) = ABCD. \]

To accomplish this without going outside the domain, we make a shift of emphasis; rather than asking for the divisors of a given number we look for all the numbers which have a given divisor. Here two properties of the divisibility concept, in which the divisor is fixed, come to mind:

(a) If $\gamma \mid \alpha$, then $\gamma \mid \alpha\lambda$ for every integer $\lambda$.

(b) If $\gamma \mid \alpha$ and $\gamma \mid \beta$, then $\gamma \mid \alpha \pm \beta$.

In other words, the set of multiples of $\gamma$ forms an additive group which is closed under multiplication by elements of the domain (but not necessarily in the set). If $\alpha \mid \beta$, then the set of multiples of $\alpha$ contains the set of multiples of $\beta$. The gcd (if there is one) of $\alpha$ and $\beta$ has as multiples the set of numbers of the form $\xi + \eta$, where $\xi$ and $\eta$ run independently over the multiples of $\alpha$ and $\beta$ respectively, and this set is again an additive group closed under multiplication by elements of the domain.

Because of the repeated occurrence of this special kind of set, we give the name ideal to any subset (containing at least one element besides zero) of an integral domain $R[\vartheta]$ which forms a group under addition and is closed under multiplication by elements of the domain. Since there is no reason to suppose that every ideal of $R[\vartheta]$ consists of all the multiples of a single element of $R[\vartheta]$, we shall designate a general ideal by a capital letter. A principal ideal, consisting of all multiples of a given element $\alpha$ of the domain, will be designated by $[\alpha]$. (It will be clear from the context whether the brackets designate an ideal or the greatest-integer function.) But instead of a single number $\alpha$, we could begin with any finite set $\alpha_1, \ldots, \alpha_m$ of elements of $R[\vartheta]$, and form all expressions

\[ \lambda_1\alpha_1 + \lambda_2\alpha_2 + \cdots + \lambda_m\alpha_m, \]

where $\lambda_1, \ldots, \lambda_m$ run independently over $R[\vartheta]$; the set of such expressions again forms an ideal, which will be designated by $[\alpha_1, \ldots, \alpha_m]$. (The numbers $\alpha_1, \ldots, \alpha_m$ are called generators of the ideal $[\alpha_1, \ldots, \alpha_m]$.) This notation is similar to that for the gcd, if such exists, except that instead of writing $(\alpha, \beta) = \gamma$ we would now write $[\alpha, \beta] = [\gamma]$. (Two ideals are said to be equal if they consist of the same numbers.) It will be shown later that $R[\vartheta]$ is a unique factorization domain if and only if every ideal of $R[\vartheta]$ is a principal ideal. This should not be surprising, since this latter condition simply requires that any two elements of $R[\vartheta]$ should have a gcd in $R[\vartheta]$ which can be expressed as a linear combination of the elements.

Theorem 2-18. If RW of degree n, and A is an ideal of RW, 

then there exist elements ai, ••• t of R\p] such that every element of 
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A can he uniquely represented in the form 

kiai + • • • + A'l, . . . , A'^ in Z. 

Remark: Note that the A'’s are rational integers, and not elements 
of R[§]. The numbers ai, . . . , of the theorem are called a basis 
of A ; they may be taken as a set of generators of .1, but may not be 
the smallest such set. 

Proof: If the polynomial defining an element a ^ 0 oi A is p(x), 
then for some h, the zeros of p^ (x) are the field conjugates of a, so that 

P^(x) = x” + + • ■ • ± Na, 

and Na = + - • ^)oc is in A. Hence A contains a 

rational integer different from zero, and therefore a smallest positive 
integer, say a. If . . . , is an integral basis of R[d]j then A con- 
tains api for each Let an be the smallest positive rational integer 
such that the number 

Oil = aiipi 

is in Since A contains anpi and ap 2 , it contains numbers which 

are linear combinations of pi and P 2 with coefficients in Z. Of these 

there is one (not necessarily unique) for which the coefficient of P 2 is 
positive and minimal. Let it be 

«2 = a2lPl + a22p2- 

Similarly, for « = 3, . . . , n, put 

OCy = a,.lPl “b Cly2P2 “b ■ • • “b O-yyPy, 

where a„,- is m Z for 1 < t < v and a„„ is positive and minimal for a, 
m .4. It is asserted that ai, . . . , q[„ form a basis of A. 

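Suppose that

\[ \alpha = c_1\rho_1 + \cdots + c_n\rho_n, \qquad c_1, \dots, c_n \text{ in } Z, \]

is in $A$. Then so also is $\alpha - c\alpha_n$ for every $c$ in $Z$. Since

\[ 0 \le c_n - a_{nn}\left[\frac{c_n}{a_{nn}}\right] < a_{nn}, \]

it follows from the minimality of $a_{nn}$ that in the representation of the
number $\alpha - [c_n/a_{nn}]\alpha_n$, the coefficient of $\rho_n$ is 0, so that

\[ \alpha - \left[\frac{c_n}{a_{nn}}\right]\alpha_n = d_1\rho_1 + \cdots + d_{n-1}\rho_{n-1}, \qquad d_1, \dots, d_{n-1} \text{ in } Z. \]

Repeating the argument, we find that

\[ \alpha - \left[\frac{c_n}{a_{nn}}\right]\alpha_n - \left[\frac{d_{n-1}}{a_{n-1,n-1}}\right]\alpha_{n-1} = e_1\rho_1 + \cdots + e_{n-2}\rho_{n-2}, \qquad e_1, \dots, e_{n-2} \text{ in } Z. \]

After $n$ steps nothing remains on the right, and we have

\[ \alpha = k_1\alpha_1 + \cdots + k_n\alpha_n, \qquad k_1, \dots, k_n \text{ in } Z, \]

the desired representation.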

If there were two representations of the same number, their
difference would be a nontrivial representation of 0:

\[ k_1\alpha_1 + \cdots + k_n\alpha_n = 0, \qquad |k_1| + \cdots + |k_n| > 0. \]

But then also

\[ k_1\alpha_1^{(m)} + \cdots + k_n\alpha_n^{(m)} = 0, \qquad m = 1, 2, \dots, n, \]

which implies that $\Delta(\alpha_1, \dots, \alpha_n) = 0$, contrary to the equation

\[ \Delta(\alpha_1, \dots, \alpha_n) = a_{11}^2 \cdots a_{nn}^2\, \Delta(\rho_1, \dots, \rho_n) \ne 0. \]

The proof is complete.

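From their definitions, it is clear that each coefficient $a_{kk}$ is positive
and not larger than $a$, the smallest positive integer in $A$. We would
like to show that bounds can also be put on the other coefficients
$a_{ij}$, $1 \le j < i \le n$. We have

\[
\begin{aligned}
\alpha_1 &= a_{11}\rho_1, \\
\alpha_2 &= a_{21}\rho_1 + a_{22}\rho_2, \\
\alpha_3 &= a_{31}\rho_1 + a_{32}\rho_2 + a_{33}\rho_3, \\
&\;\;\vdots \\
\alpha_n &= a_{n1}\rho_1 + a_{n2}\rho_2 + a_{n3}\rho_3 + \cdots + a_{nn}\rho_n.
\end{aligned}
\tag{13}
\]

Theorem 2-19. Every ideal in $R[\vartheta]$ has a basis $\alpha_1, \dots, \alpha_n$, given
by (13), in which the numbers $a_{ij}$ are rational integers with

\[ 0 \le a_{ij} < a_{jj} \le a. \]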

Proof: It is clear that any system of numbers $\alpha_1, \dots, \alpha_{i-1}$,
$\alpha_i - k\alpha_j$, $\alpha_{i+1}, \dots, \alpha_n$, in which $k$ is a rational integer and $j \ne i$, is
also a basis of $A$, for if

\[ \alpha = k_1\alpha_1 + k_2\alpha_2 + \cdots + k_n\alpha_n, \qquad k_1, \dots, k_n \text{ in } Z, \]

then

\[ \alpha = k_1\alpha_1 + \cdots + (k_j + kk_i)\alpha_j + \cdots + k_i(\alpha_i - k\alpha_j) + \cdots + k_n\alpha_n. \]

In the set of equations (13), subtract a suitable multiple of $\alpha_{n-1}$ from
$\alpha_n$, so that the new coefficient of $\rho_{n-1}$ is non-negative but smaller than
$a_{n-1,n-1}$. Then subtract a suitable multiple of $\alpha_{n-2}$, so that the new
coefficient of $\rho_{n-2}$ is non-negative but smaller than $a_{n-2,n-2}$; this does
not disturb the coefficient of $\rho_{n-1}$. Continuing the process, we come
eventually to a basis element $\alpha_n'$ such that $0 \le a_{ni}' < a_{ii}$ for $i = 1, \dots, n-1$.
Then we change $\alpha_{n-1}$ by subtracting off suitable multiples of $\alpha_{n-2}$,
$\alpha_{n-3}, \dots, \alpha_1$, etc. The result is a basis as described in the theorem.
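
The constructions in the proofs of Theorems 2-18 and 2-19 can be tried out numerically. The following Python sketch is not part of the text. It assumes that the ideal is supplied as a list of integer coordinate vectors, taken with respect to a fixed integral basis $\rho_1, \dots, \rho_n$, which generate it as a $Z$-module; the function name `triangular_basis` is hypothetical. It first produces a triangular system of the shape (13) by Euclidean reduction, column by column, and then reduces the coefficients below the diagonal as in the proof of Theorem 2-19.

```python
def triangular_basis(gens):
    """Return rows alpha_1..alpha_n in the shape (13), with 0 <= a_ij < a_jj for j < i.

    `gens` is a list of integer coordinate vectors (w.r.t. rho_1..rho_n) that
    generate a full-rank Z-module; this input format is an assumption of the sketch.
    """
    n = len(gens[0])
    rows = [list(g) for g in gens]
    basis = [None] * n
    for col in range(n - 1, -1, -1):
        # Euclidean algorithm on column `col`: leave a single row with a nonzero entry.
        while True:
            nonzero = [r for r in rows if r[col] != 0]
            if len(nonzero) <= 1:
                break
            pivot = min(nonzero, key=lambda r: abs(r[col]))
            for r in nonzero:
                if r is not pivot:
                    q = r[col] // pivot[col]
                    for t in range(n):
                        r[t] -= q * pivot[t]
        nonzero = [r for r in rows if r[col] != 0]
        if not nonzero:
            raise ValueError("generators do not span a module of full rank")
        pivot = nonzero[0]
        if pivot[col] < 0:
            pivot[:] = [-x for x in pivot]
        basis[col] = pivot                     # this row plays the role of alpha_{col+1}
        rows = [r for r in rows if r is not pivot]
    # Reduction step of Theorem 2-19: make 0 <= a_ij < a_jj for j < i.
    for i in range(n):
        for j in range(i - 1, -1, -1):
            q = basis[i][j] // basis[j][j]
            basis[i] = [x - q * y for x, y in zip(basis[i], basis[j])]
    return basis


# The ideal [2 + i] of Z[i], basis 1, i: generated over Z by (2 + i)*1 and (2 + i)*i.
gaussian = triangular_basis([[2, 1], [-1, 2]])
# The ideal [3, 1 + sqrt(-5)] of Z[sqrt(-5)], basis 1, sqrt(-5).
nonprincipal = triangular_basis([[3, 0], [0, 3], [1, 1], [-5, 1]])
```

In the first example the rows returned correspond to $\alpha_1 = 5$ and $\alpha_2 = 2 + i$, and in the second to $\alpha_1 = 3$ and $\alpha_2 = 1 + \sqrt{-5}$.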

Corollary. A positive rational integer occurs in only finitely many
ideals of $R[\vartheta]$.

This follows immediately from the theorem, for if $a$ is in $A$, then
$a_{jj} \le a$, and there are only finitely many sets of coefficients $a_{ij}$
satisfying the conditions of the theorem.

The discriminant of the elements of a basis of an ideal is called the
discriminant of the ideal; its value is independent of the choice of
basis. For if $\alpha_1, \dots, \alpha_n$ and $\alpha_1', \dots, \alpha_n'$ are bases of $A$, then there
are $h_{kl}$ in $Z$ such that

\[ \alpha_k = \sum_{l=1}^{n} h_{kl}\alpha_l', \qquad k = 1, \dots, n, \]

and

\[ \det |h_{kl}| \ne 0. \]

Hence

\[ \Delta(\alpha_1, \dots, \alpha_n) = (\det |h_{kl}|)^2\, \Delta(\alpha_1', \dots, \alpha_n'), \]

so that the discriminants have the same sign and

\[ \Delta(\alpha_1', \dots, \alpha_n') \mid \Delta(\alpha_1, \dots, \alpha_n). \]

By symmetry,

\[ \Delta(\alpha_1, \dots, \alpha_n) \mid \Delta(\alpha_1', \dots, \alpha_n'), \]

and the discriminants are equal.

PROBLEMS

1. Show that every ideal in Z is principal. 

2. If $A = [p, a + b\sqrt{d}]$ is an ideal of $R[\sqrt{d}]$, where $p$ is a rational prime
and $d$ is a square-free integer not of the form $4k + 1$, show that $p$ and
$a + (b - p[b/p])\sqrt{d}$ form a basis for $A$.

2-6 The arithmetic of ideals. Ideals are special kinds of sets of 
elements. The emphasis so far has been on the elements comprising 
the sets. The whole power of the theory of ideals, however, lies in 
considering them not as collections of elements, but as entities in their 
own right, which can be combined according to certain operations. 

The first of these operations is multiplication. If $A = [\alpha_1, \dots, \alpha_r]$
and $B = [\beta_1, \dots, \beta_s]$, then the product $AB$ is the ideal

\[ [\alpha_1\beta_1, \dots, \alpha_1\beta_s, \alpha_2\beta_1, \dots, \alpha_r\beta_s]. \]

The product ideal does not depend on the representation chosen for
$A$ and $B$. To show this, let $AB = C$, and suppose that also

\[ A = [\alpha_1', \dots, \alpha_{r'}'], \qquad B = [\beta_1', \dots, \beta_{s'}']. \]

To keep matters straight, designate these last ideals by $A'$ and $B'$,
even though they are equal to $A$ and $B$. We must show that every
element of $C$ is also an element of $A'B' = C'$, and conversely.

First of all, $\alpha_i'$ is in $A$ and $\beta_j'$ is in $B$, so that we can write

\[ \alpha_i' = \lambda_1\alpha_1 + \cdots + \lambda_r\alpha_r, \qquad \beta_j' = \mu_1\beta_1 + \cdots + \mu_s\beta_s. \]

Hence the number

\[ \alpha_i'\beta_j' = \sum \lambda_k\mu_l\,\alpha_k\beta_l = \sum \nu_{kl}\,\alpha_k\beta_l \]

is in $C$ for $1 \le i \le r'$, $1 \le j \le s'$. Since $C$ is an ideal, every linear
combination of the numbers $\alpha_i'\beta_j'$ is in $C$; thus $C'$ is a subset of $C$.
Hence $C = C'$, by symmetry.

Theorem 2-20. If $A$ is an ideal of $R[\vartheta]$, there exists an ideal $B$ of
$R[\vartheta]$ such that $AB$ is a principal ideal $[a]$, where $a$ is in $Z$.

Remark : It is this theorem which is the crux of the whole matter. 
As indicated in the discussion at the beginning of Section 2-5, we are 
trying to enlarge the set of possible divisors of an integer by introduc- 
ing ideal elements. Given any such divisor, there should certainly be 
a second divisor whose product with the first is the original integer. 




Since we have taken divisors as sets, we must identify the original 
integer with the set of all its multiples. It should be noted that all 
the associates of a given integer generate the same principal ideal. 

Proof: Suppose $A = [\alpha_1, \dots, \alpha_r]$, and put

\[ f(x) = \alpha_1 + \alpha_2 x + \cdots + \alpha_r x^{r-1}. \]

By representing $\alpha_1, \dots, \alpha_r$ as polynomials in $\vartheta$, and replacing $\vartheta$ in all
the polynomials by $\vartheta^{(2)}, \dots, \vartheta^{(n)}$ in turn, we get sets $\alpha_1^{(j)}, \dots, \alpha_r^{(j)}$,
where $j = 2, 3, \dots, n$. We define

\[ g(x) = \prod_{j=2}^{n} \left(\alpha_1^{(j)} + \alpha_2^{(j)}x + \cdots + \alpha_r^{(j)}x^{r-1}\right) = \beta_1 + \beta_2 x + \cdots + \beta_s x^{s-1}. \]

The $\beta$'s are symmetric polynomials, with rational integral coefficients,
in all the conjugates of $\alpha_1, \dots, \alpha_r$ except $\alpha_1, \dots, \alpha_r$ themselves.
Hence they are polynomials in $\alpha_1, \dots, \alpha_r$, with coefficients in $Z$, and
therefore are in $R[\vartheta]$. It is asserted that the ideal $B = [\beta_1, \dots, \beta_s]$
satisfies the conditions of the theorem.

Put

\[ f(x)g(x) = \gamma_1 + \gamma_2 x + \cdots + \gamma_{r+s-1}x^{r+s-2}. \]

Since each $\gamma$ is a symmetric polynomial, with rational integral
coefficients, in each $\alpha_i$ and its conjugates, the $\gamma$'s are themselves
rational integers. Let their gcd be $a$. Then $a$ can be represented as a
linear combination of $\gamma_1, \dots, \gamma_{r+s-1}$ with coefficients in $Z$; since
$\gamma_1, \dots, \gamma_{r+s-1}$ are obviously in $AB$, $a$ is in $AB$, and so $[a]$ is a subset
of $AB$.

If we knew that $a$ divides every product $\alpha_i\beta_j$, then we would know
that every element of $AB$ is contained in $[a]$. The proof will therefore
be complete when we prove Theorem 2-21, which is A. Hurwitz'
extension of a theorem due to Gauss.

Theorem 2-21. Let

\[ A(x) = \alpha_0 x^r + \cdots + \alpha_r, \qquad B(x) = \beta_0 x^s + \cdots + \beta_s, \]

where $\alpha_0\beta_0 \ne 0$, be polynomials with integral algebraic coefficients.
If an algebraic integer $\delta$ divides every coefficient of

\[ C(x) = A(x)B(x) = \gamma_0 x^t + \cdots + \gamma_t, \]

in the sense that each quotient $\gamma_i/\delta$ is an algebraic integer, then $\delta$ also
divides every product $\alpha_i\beta_j$.

Proof: First we prove a lemma: if

\[ f(x) = \delta_0 x^u + \cdots + \delta_u, \qquad \delta_0 \ne 0, \]

is any polynomial with integral algebraic coefficients and a zero $\rho$, then
$f(x)/(x - \rho)$ has integral coefficients. The proof is by induction on $u$.
If $u = 1$, then $f(x) = \delta_0 x + \delta_1$, and

\[ \frac{f(x)}{x - \rho} = \frac{\delta_0 x + \delta_1}{x + \delta_1/\delta_0} = \delta_0 \]

is an integer. Suppose the lemma true for all polynomials of degree
less than $u$. Then the polynomial

\[ \varphi(x) = f(x) - \delta_0 x^{u-1}(x - \rho) \]

has integral algebraic coefficients (by the second part of Theorem 2-10),
and has degree less than $u$ and vanishes for $x = \rho$. By
the induction hypothesis,

\[ \frac{\varphi(x)}{x - \rho} = \frac{f(x)}{x - \rho} - \delta_0 x^{u-1} \]

has integral algebraic coefficients, and the same is therefore true of
$f(x)/(x - \rho)$. The lemma follows by the induction principle.

By repeated application of the lemma, we deduce that if $f(x) =
\delta_0(x - \rho_1) \cdots (x - \rho_u)$, then any product $\delta_0\rho_1 \cdots \rho_k$ is an integer.

Returning to Theorem 2-21, suppose that

\[ A(x) = \alpha_0(x - \rho_1) \cdots (x - \rho_r), \qquad B(x) = \beta_0(x - \sigma_1) \cdots (x - \sigma_s). \]

By assumption, the polynomial

\[ \frac{C(x)}{\delta} = \frac{\alpha_0\beta_0}{\delta}(x - \rho_1) \cdots (x - \sigma_s) \]

has integral coefficients, and it follows that any product

\[ \frac{\alpha_0\beta_0}{\delta}\,\rho_{n_1} \cdots \rho_{n_i}\,\sigma_{m_1} \cdots \sigma_{m_j}, \qquad 1 \le n_1 < n_2 < \cdots < n_i \le r, \quad 1 \le m_1 < m_2 < \cdots < m_j \le s, \tag{14} \]

is an integer. Since $\alpha_i/\alpha_0$ and $\beta_j/\beta_0$ are (up to sign) elementary symmetric
functions in the $\rho$'s and $\sigma$'s, respectively, the number

\[ \frac{\alpha_i\beta_j}{\delta} = \frac{\alpha_0\beta_0}{\delta} \cdot \frac{\alpha_i}{\alpha_0} \cdot \frac{\beta_j}{\beta_0} \]

is a sum of terms of the form (14), and is therefore an integer. The
proof is complete.
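
For polynomials with rational integral coefficients, Theorem 2-21 reduces to the familiar lemma of Gauss on the content of a product, and the statement can be checked directly. The following Python sketch is an illustration only and is not taken from the text: the helper names `poly_mul` and `content` and the sample polynomials are assumptions chosen for the example. It takes $\delta$ to be the gcd of the coefficients of $C(x) = A(x)B(x)$ and verifies that $\delta$ divides every product $a_i b_j$.

```python
from functools import reduce
from math import gcd


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c


def content(p):
    """Greatest common divisor of the coefficients of p."""
    return reduce(gcd, p, 0)


A = [6, 10]        # A(x) = 6x + 10
B = [15, 21, 9]    # B(x) = 15x^2 + 21x + 9
C = poly_mul(A, B)
delta = content(C)  # the largest rational integer dividing every coefficient of C

# Theorem 2-21 (in the special case of rational integers): delta divides each a_i * b_j.
all_divisible = all((ai * bj) % delta == 0 for ai in A for bj in B)
```

Here $\delta$ equals the product of the contents of $A$ and $B$, which is the classical form of Gauss's lemma; Theorem 2-21 asserts the divisibility property for algebraic integers, where a gcd of the coefficients need not exist as a number.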

Theorem 2-22. If $AC = BC$, then $A = B$.

Remark: Note that there is no zero ideal.

Proof: Let $D$ be an ideal such that $CD = [c]$, a principal ideal.
Then $ACD = BCD$, so $A[c] = B[c]$. Thus $c$ times any element of $A$
is equal to $c$ times some element of $B$, and conversely, so that $A = B$.

If A = BC, then we say that C divides A, and write C\A. 

Theorem 2-23. $A \mid C$ if and only if every element of $C$ is in $A$.

Proof: If $A = [\alpha_1, \dots, \alpha_r]$ and $B = [\beta_1, \dots, \beta_s]$, then $AB = C =
[\dots, \alpha_i\beta_j, \dots]$, so every element of $C$ is in $A$, and also in $B$.

Conversely, suppose that every element of $C$ is in $A$. Then every
element of $CD$ is in $AD$, for every $D$. Choose $D$ so that $AD = [c]$ is
principal, and let $CD = [\sigma_1, \dots, \sigma_t]$. Then for each $i$ with $1 \le i \le t$,
$\sigma_i = c\lambda_i$ for a suitable integer $\lambda_i$. Hence $CD = [c][\lambda_1, \dots, \lambda_t] =
AD[\lambda_1, \dots, \lambda_t]$, and by Theorem 2-22, $C = A[\lambda_1, \dots, \lambda_t]$, so that
$A \mid C$.

Theorem 2-24. An ideal is divisible by only a finite number of
ideals.

Proof: If the ideal is $A$, choose $B$ so that $AB = [c]$, where $c$ is a
positive integer. Then $c$ is in $A$ and in every divisor of $A$, and by the
corollary to Theorem 2-19, there are only finitely many such ideals.

A common divisor of A and B which is divisible by every common 
divisor is called a greatest common divisor (gcd) of A and B. 

Theorem 2-25. Every pair of ideals $A$ and $B$ has a unique gcd,
$(A, B)$. It is composed of the numbers $\alpha + \beta$, where $\alpha$ runs over $A$
and $\beta$ over $B$.

Proof: Let $D$ be the set described in the theorem; it is clearly an
ideal. Since 0 is in $A$ and $B$, $D$ contains every element of $A$ and of $B$,
and so is a divisor of $A$ and of $B$. Any common divisor of $A$ and $B$
contains all the elements of $A$ and of $B$, and since it is closed under
addition, it contains all numbers $\alpha + \beta$, and so divides $D$.

If $D'$ is also a gcd of $A$ and $B$, then $D$ and $D'$ are divisors of each
other, and so each contains the other. Thus $D = D'$.

If the gcd of $A$ and $B$ is $[1]$, we say that $A$ and $B$ are relatively prime.
As an immediate consequence of this definition and Theorem 2-25,
we have

Theorem 2-26. If $A$ and $B$ are relatively prime, there exist $\alpha$ in $A$
and $\beta$ in $B$ such that $\alpha + \beta = 1$.

Theorem 2-27. If $A \mid BC$ and $A$ is prime to $B$, it divides $C$.

Proof: Choose $\alpha$ in $A$ and $\beta$ in $B$ so that $\alpha + \beta = 1$. Then if
$\gamma$ is in $C$, $\alpha\gamma + \beta\gamma = \gamma$, and $\beta\gamma$ and $\alpha\gamma$ are in $A$, so that $\gamma$ is in $A$.
Hence $A \mid C$.

If A has no divisors except itself and [1], then A is said to be prime. 

Theorem 2-28. Every ideal can be represented as a finite product of
prime ideals, and the representation is unique except for the order of
factors.

The finiteness of the representation follows from Theorem 2-24, and
the uniqueness from Theorem 2-27.

In particular, it follows that the principal ideal generated by any
element of $R[\vartheta]$ has a unique factorization into prime ideals of $R[\vartheta]$.
If these prime factors are themselves always principal ideals, we might
expect that ideals can be dispensed with entirely, and that there is
then unique factorization of the numbers themselves.

Theorem 2-29. A necessary and sufficient condition that $R[\vartheta]$ be a
unique factorization domain is that every ideal of $R[\vartheta]$ be a principal
ideal.

Proof: Uniqueness of factorization in $R[\vartheta]$ is equivalent to the
property:

\[ \text{if } \alpha \mid \beta\gamma \text{ and } \alpha \text{ and } \beta \text{ are relatively prime, then } \alpha \mid \gamma. \tag{15} \]

For if the domain has this property, unique factorization can be proved
in the usual way, while if factorization is unique and $\alpha \mid \beta\gamma$, then every
prime $\pi$ dividing $\alpha$ must occur in the factorization of $\beta\gamma$; since this
factorization is the product of the factorizations of $\beta$ and $\gamma$, if $\pi$ does
not occur in $\beta$ it must occur in $\gamma$.

Suppose that factorization is unique in $R[\vartheta]$, so that (15) holds.
Then if $\pi$ is a prime number, $[\pi]$ is a prime ideal. For if $[\pi] = AB$,
where neither $A$ nor $B$ is $[\pi]$, there would exist an $\alpha$ in $A$ and a $\beta$ in $B$,
neither of which is divisible by $\pi$, while their product is.

Let $P$ be any prime ideal, and $\alpha = \pi_1^{a_1} \cdots \pi_r^{a_r}$ any element of $P$.
Then

\[ [\alpha] = [\pi_1]^{a_1} \cdots [\pi_r]^{a_r}, \]

and since $\alpha$ is in $P$, so is every element of $[\alpha]$, whence $P \mid [\alpha]$ and $P$ is
one of the principal ideals $[\pi_i]$. Since every prime ideal is principal,
every ideal is principal.

Now suppose that every ideal in $R[\vartheta]$ is principal, and that $\alpha$ and $\beta$
are relatively prime. Then $[\alpha, \beta] = [\gamma]$, for some $\gamma$, and every linear
combination $\lambda\alpha + \mu\beta$ is a multiple of $\gamma$. Taking $\lambda = 1$ and $\mu = 0$,
we have $\gamma \mid \alpha$; for $\lambda = 0$ and $\mu = 1$, we obtain $\gamma \mid \beta$. Hence $\gamma$ is a unit,
$[\alpha, \beta] = [1]$, and we can take $\gamma = 1$. Thus there are $\lambda$ and $\mu$ such
that $\lambda\alpha + \mu\beta = 1$, so that if $\alpha \mid \beta\gamma$, then $\alpha$ divides $\lambda\alpha\gamma + \mu\beta\gamma = \gamma$,
and (15) holds. Hence factorization is unique.
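
A standard example in which the condition of Theorem 2-29 fails is the domain $R[\sqrt{-5}]$, which reappears in the problems of Section 2-8. The following Python sketch is not from the text, and its function names are hypothetical. It represents $x + y\sqrt{-5}$ by the pair $(x, y)$ and searches for elements of small norm $x^2 + 5y^2$. Since no element has norm 2 or 3, none of the numbers $2$, $3$, $1 \pm \sqrt{-5}$ (of norms 4, 9, 6, 6) has a proper factor, and yet $2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$.

```python
D = -5  # work in the integers x + y*sqrt(-5); D = -5 is the assumption of this example


def norm(z):
    """Norm x^2 - D*y^2 of z = (x, y), i.e. x^2 + 5*y^2 when D = -5."""
    x, y = z
    return x * x - D * y * y


def mul(z, w):
    """Product of z = (a, b) and w = (c, e), using sqrt(D)^2 = D."""
    a, b = z
    c, e = w
    return (a * c + D * b * e, a * e + b * c)


def elements_of_norm(n):
    """All (x, y) with x^2 + 5*y^2 = n, found by brute force."""
    bound = int(n ** 0.5) + 1
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if norm((x, y)) == n]


units = elements_of_norm(1)           # only +1 and -1
no_norm_2 = elements_of_norm(2)       # empty
no_norm_3 = elements_of_norm(3)       # empty
six = mul((1, 1), (1, -1))            # (1 + sqrt(-5)) * (1 - sqrt(-5))
```

Because the only units are $\pm 1$, the two factorizations of 6 are essentially different, so by Theorem 2-29 this domain must contain an ideal which is not principal.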

PROBLEMS 

1. Using Theorem 2-21, reformulate and prove the new version of 
Eisenstein’s irreducibility criterion, as given in Problem 1 , Section 2-2. 

2. Show that if $A = [a_1 + b_1\sqrt{d},\ a_2 + b_2\sqrt{d}]$ is an ideal of $R[\sqrt{d}]$, then
the product of $A$ with its conjugate ideal $A' = [a_1 - b_1\sqrt{d},\ a_2 - b_2\sqrt{d}]$ is
principal.

2-7 Congruences. The norm of an ideal. Two elements $\alpha$ and $\beta$
of $R[\vartheta]$ will be said to be congruent modulo an ideal $A$ if their difference
lies in $A$, that is, if $A$ divides the ideal $[\alpha - \beta]$. This is a natural
extension of the earlier notion of congruence of rational integers, if
the modulus $m$ is identified with the principal ideal $[m]$. The familiar
properties of congruences are easily seen to hold.

For fixed $\alpha$, the set of all elements of $R[\vartheta]$ which are congruent to $\alpha$
modulo $A$ is called a residue class modulo $A$.

Theorem 2-30. There are only finitely many residue classes
modulo $A$.

Proof: Choose $B$ so that $AB = [c]$, where $c$ is a rational integer.
Then $\alpha_1 \not\equiv \alpha_2 \pmod{A}$ implies that $\alpha_1 \not\equiv \alpha_2 \pmod{[c]}$, since $A \mid [c]$
and therefore $A$ contains $[c]$. So if we can show that there are only
finitely many elements, no two of which are congruent modulo $[c]$,
the theorem follows. But this is an immediate consequence of the
fact that in the basis representation

\[ \alpha = r_1\omega_1 + \cdots + r_n\omega_n, \]

where $\omega_1, \dots, \omega_n$ form an integral basis of $R[\vartheta]$, each of the rational
integral coefficients $r_1, \dots, r_n$ has only $c$ possible values modulo $c$,
and that if

\[ r_i \equiv r_i' \pmod{c}, \qquad i = 1, \dots, n, \]

then

\[ r_1\omega_1 + \cdots + r_n\omega_n \equiv r_1'\omega_1 + \cdots + r_n'\omega_n \pmod{[c]}. \]

The number of residue classes modulo $A$ is called the norm of $A$,
written $NA$. For the time being, it is necessary to distinguish between
$N\alpha$ and $N[\alpha]$, the norms of the number $\alpha$ and the ideal $[\alpha]$, respectively.
However, we shall soon see that the two quantities are essentially
the same.

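Theorem 2-31. If $R[\vartheta]$ has discriminant $\Delta$, and $A$ is an ideal of
$R[\vartheta]$ having discriminant $\Delta(A)$, then

\[ \Delta(A) = (NA)^2 \Delta. \]

Proof: Let $\alpha_1, \dots, \alpha_n$ be the basis of $A$ described in Theorem 2-19,
and let $\rho_1, \dots, \rho_n$ be the integral basis of $R[\vartheta]$ used there. Then

\[ \Delta(A) = (a_{11} \cdots a_{nn})^2 \Delta, \]

and we must show that $NA = a_{11} \cdots a_{nn}$; that is, that there are
$a_{11} \cdots a_{nn}$ numbers of $R[\vartheta]$, no two of which are congruent modulo $A$,
and such that every element of $R[\vartheta]$ is congruent to one of them. We
show that this is true of the numbers

\[ r_1\rho_1 + \cdots + r_n\rho_n, \]

where $0 \le r_k < a_{kk}$ for $k = 1, \dots, n$. If two of these numbers are
congruent, say

\[ r_1\rho_1 + \cdots + r_n\rho_n \equiv r_1'\rho_1 + \cdots + r_n'\rho_n \pmod{A}, \]

and $r_n \ge r_n'$, then

\[ (r_1 - r_1')\rho_1 + \cdots + (r_n - r_n')\rho_n \equiv 0 \pmod{A}. \]

But $a_{nn}$ is the smallest positive rational integer for which any number
of the form

\[ s_1\rho_1 + \cdots + s_{n-1}\rho_{n-1} + a_{nn}\rho_n \]

is in $A$; since $0 \le r_n - r_n' < a_{nn}$, it follows that $r_n = r_n'$.
Similarly, $r_{n-1} = r_{n-1}', \dots, r_1 = r_1'$.

If

\[ \omega = s_1\rho_1 + \cdots + s_n\rho_n, \]

then

\[ \omega - \left[\frac{s_n}{a_{nn}}\right]\alpha_n = s_1'\rho_1 + \cdots + s_{n-1}'\rho_{n-1} + b_n\rho_n, \]

where $0 \le b_n < a_{nn}$. By iteration, subtracting suitable rational integral
multiples $k_{n-1}\alpha_{n-1}, \dots, k_1\alpha_1$ in turn, we obtain

\[ \omega - \left[\frac{s_n}{a_{nn}}\right]\alpha_n - k_{n-1}\alpha_{n-1} - \cdots - k_1\alpha_1 = b_1\rho_1 + \cdots + b_n\rho_n, \]

where $0 \le b_k < a_{kk}$ for $k = 1, \dots, n$, and

\[ \omega \equiv b_1\rho_1 + \cdots + b_n\rho_n \pmod{A}. \]

Corollary. $N[\alpha] = |N\alpha|$.

For $\alpha\rho_1, \dots, \alpha\rho_n$ is clearly a basis for $[\alpha]$, and

\[ \Delta(\alpha\rho_1, \dots, \alpha\rho_n) = (N\alpha)^2 \Delta, \]

so that $(N\alpha)^2 = (N[\alpha])^2$. But $N[\alpha]$, being the number of residue
classes, is positive.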
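
The corollary can be checked by brute force in the domain of Gaussian integers, where $N(a + bi) = a^2 + b^2$. The Python sketch below is an illustration only and is not part of the text; the function names are hypothetical. It represents $x + yi$ by the pair $(x, y)$ and counts residue classes modulo $[\alpha]$ directly from the definition. Since the rational integer $n = a^2 + b^2$ lies in $[\alpha]$, every Gaussian integer is congruent modulo $[\alpha]$ to one with $0 \le x, y < n$, so it suffices to sort these $n^2$ candidates into classes.

```python
def divides(alpha, z):
    """True if the Gaussian integer z = (x, y) is a multiple of alpha = (a, b) in Z[i]."""
    a, b = alpha
    x, y = z
    n = a * a + b * b
    # z / alpha = z * conj(alpha) / n, and z * conj(alpha) = (xa + yb) + (ya - xb) i.
    return (x * a + y * b) % n == 0 and (y * a - x * b) % n == 0


def ideal_norm(alpha):
    """Number of residue classes of Z[i] modulo the principal ideal [alpha], alpha != 0."""
    a, b = alpha
    n = a * a + b * b
    representatives = []
    for x in range(n):
        for y in range(n):
            # (x, y) starts a new class unless it is congruent to a known representative.
            if not any(divides(alpha, (x - u, y - v)) for u, v in representatives):
                representatives.append((x, y))
    return len(representatives)


norm_of_2_plus_i = ideal_norm((2, 1))   # compare with N(2 + i) = 5
norm_of_1_plus_i = ideal_norm((1, 1))   # compare with N(1 + i) = 2
norm_of_3 = ideal_norm((3, 0))          # compare with N(3) = 9
```

The search is quadratic in the number of candidates, so it is meant only for very small norms.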


Theorem 2-32. If $A$ and $B$ are ideals, then there is an $\alpha$ in $A$
such that $([\alpha], AB) = A$.

If such an $\alpha$ exists, then clearly $[\alpha] = AC$, where $(B, C) = [1]$.
If we rephrase the theorem, its close relation to Theorem 2-20
becomes clear: given two ideals $A$ and $B$, there is a $C$ such that $AC$ is
principal and $(B, C) = [1]$.

Proof: Let $P_1, \dots, P_r$ be the distinct primes dividing $AB$, and let

\[ A = P_1^{a_1} \cdots P_r^{a_r}, \qquad a_i \ge 0. \]

Put

\[ D_i = \prod_{j \ne i} P_j^{a_j + 1}, \qquad i = 1, \dots, r. \]

Since $(D_1, \dots, D_r) = [1]$, there are numbers $\delta_i$ in $D_i$, for $i = 1, \dots, r$,
such that

\[ \delta_1 + \cdots + \delta_r = 1. \]

Then $[\delta_i]$ is divisible by $D_i$, and therefore by $P_k$ for $k \ne i$, and therefore
not by $P_i$, since 1 is not. Now let $\alpha_i$ be an element of $P_i^{a_i}$ which
does not occur in $P_i^{a_i+1}$, for $i = 1, \dots, r$, and put

\[ \alpha = \alpha_1\delta_1 + \cdots + \alpha_r\delta_r. \]

Then for each $i$, every term but one in this representation of $\alpha$ occurs
in $P_i^{a_i+1}$, while the remaining term occurs in $P_i^{a_i}$ but not in $P_i^{a_i+1}$.
Hence $P_i^{a_i} \mid [\alpha]$, but $P_i^{a_i+1} \nmid [\alpha]$.

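Theorem 2-33. The congruence

\[ \alpha\xi \equiv \beta \pmod{A} \]

is solvable if and only if $D \mid [\beta]$, where $D = ([\alpha], A)$. The solution, if
it exists, is unique modulo $A/D$.

Proof: If $\xi$ is a solution, then $\alpha\xi - \beta = \tau$ is in $A$, and therefore in
$D$. Since also $\alpha\xi$ is in $D$, it follows that $\beta$ is in $D$, so $D \mid [\beta]$.

If $\beta$ is in $D$, then it is the sum of an element of $[\alpha]$ and an element of
$A$; that is, $\beta = \alpha\xi + \delta$. Since $\delta \equiv 0 \pmod{A}$, $\alpha\xi \equiv \beta \pmod{A}$.

If $\alpha\xi \equiv \alpha\xi' \equiv \beta \pmod{A}$, then $\alpha(\xi - \xi') \equiv 0 \pmod{A}$. Hence if
$[\alpha] = DA_1$ and $A = DA_2$, then $(A_1, A_2) = [1]$ and

\[ DA_2 \mid DA_1[\xi - \xi'], \qquad A_2 \mid A_1[\xi - \xi'], \qquad A_2 \mid [\xi - \xi'], \qquad \xi \equiv \xi' \pmod{A_2}. \]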

Theorem 2-34. $N(AB) = NA \cdot NB$.

Proof: By Theorem 2-32, there is a $\gamma$ such that

\[ ([\gamma], AB) = A. \]

Let $NA = n_1$, $NB = n_2$, and let $\alpha_1, \dots, \alpha_{n_1}$ and $\beta_1, \dots, \beta_{n_2}$ be complete
residue systems modulo $A$ and $B$, respectively. We shall show
that the $n_1n_2$ numbers $\alpha_i + \gamma\beta_j$ form a complete residue system
modulo $AB$.

If

\[ \alpha_i + \gamma\beta_j \equiv \alpha_k + \gamma\beta_l \pmod{AB}, \]

then

\[ \gamma(\beta_j - \beta_l) \equiv \alpha_k - \alpha_i \pmod{AB}, \]

and by Theorem 2-33, $([\gamma], AB) \mid [\alpha_k - \alpha_i]$, so that $A \mid [\alpha_k - \alpha_i]$. But
this gives $\alpha_k \equiv \alpha_i \pmod{A}$, so $k = i$. Moreover,

\[ \gamma(\beta_j - \beta_l) \equiv 0 \pmod{AB}, \qquad \beta_j - \beta_l \equiv 0 \pmod{B}, \qquad j = l. \]

To show that every integer $\delta$ is congruent to one of the above
numbers, choose $\alpha_i$ so that $\delta \equiv \alpha_i \pmod{A}$. Then the congruence

\[ \gamma\xi \equiv \delta - \alpha_i \pmod{AB} \]

is solvable, since $([\gamma], AB) = A$ is a divisor of $[\delta - \alpha_i]$. Finally, $\xi$ is
unique modulo $B$, and can therefore be taken to be one of the numbers
$\beta_j$.

Theorem 2-35. $NA$ is an element of $A$.

Proof: If $\alpha_1, \dots, \alpha_{NA}$ is a complete residue system modulo $A$, then
so is $\alpha_1 + 1, \dots, \alpha_{NA} + 1$. Hence

\[ \alpha_1 + \cdots + \alpha_{NA} \equiv (\alpha_1 + 1) + \cdots + (\alpha_{NA} + 1) \pmod{A}, \]

\[ 0 \equiv NA \pmod{A}. \]

Corollary. There are only finitely many ideals of given norm. 

For by the corollary to Theorem 2-19, a positive rational integer 
occurs in only finitely many ideals. 

PROBLEMS

1. Show that if $P$ is a prime ideal of $R[\vartheta]$, the congruence

\[ \alpha_0 x^m + \alpha_1 x^{m-1} + \cdots + \alpha_m \equiv 0 \pmod{P} \]

with coefficients in $R[\vartheta]$ has at most $m$ incongruent solutions modulo $P$.

2. Show that if $P$ is a prime ideal of $R[\vartheta]$, $\alpha$ is an element of $R[\vartheta]$, and
$P \nmid [\alpha]$, then

\[ \alpha^{NP - 1} \equiv 1 \pmod{P}. \]

2-8 Prime ideals

Theorem 2-36. If $NA$ is prime, so is $A$.

This follows immediately from Theorem 2-34.

Theorem 2-37. There are infinitely many prime ideals $P$ in any
domain $R[\vartheta]$. Each such $P$ divides exactly one rational prime $p$, and
$NP = p^f$ where $f$, called the degree of $P$, is a positive integer not
exceeding the degree of $R(\vartheta)$.

Proof: Let $p$ be a rational prime, and let $P$ be one of the factors of
$[p]$ in $R[\vartheta]$. Then if $P$ also divided the ideal defined by another
rational prime $p'$, it would divide their gcd, which is $[1]$. Hence each
$P$ divides at most one $p$, and each of the infinitely many rational
primes $p$ is divisible by at least one $P$, so that there must be infinitely
many $P$'s.

Now let $a$ be a rational integer such that $P \mid [a]$; by Theorem 2-35,
we could take $a = NP$. If $a = p_1 \cdots p_r$ then

\[ P \mid [p_1] \cdots [p_r], \]

and so $P \mid [p_i]$ for some $i$.

Finally, if $P \mid [p]$ then $[p] = PA$ for some $A$. By the corollary to
Theorem 2-31,

\[ N[p] = |Np| = p^n, \]

and so $N(PA) = NP \cdot NA = p^n$. Hence $NP \mid p^n$, and the proof is
complete.

Theorem 2-37 shows that the primes of $R[\vartheta]$ are to be found among
the factors of the principal ideals $[p]$. Only partial information is
available about the way these ideals decompose, and the derivation of
most of what is known is too intricate for inclusion here, but we shall
prove the simpler half of a famous theorem due to Dedekind, which
states that $[p]$ is divisible by the square of a prime ideal in $R[\vartheta]$ if and
only if $p$ divides $\Delta$, the discriminant of $R(\vartheta)$.

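Theorem 2-38. If $p$ does not divide $\Delta$, then $[p]$ factors as a product
of (one or more) distinct prime ideals.

Proof: Suppose that $P^2 \mid [p]$, so that $[p] = P^2M$. Choose an element
$\alpha$ of $PM$ which does not belong to $P^2M$, so that $p \mid \alpha^2$ but $p \nmid \alpha$. Since
$p \ge 2$, $p \mid (\alpha\beta)^p$ for every $\beta$ in $R[\vartheta]$.

For an arbitrary element $\gamma$ of $R(\vartheta)$, define $S\gamma$, the trace of $\gamma$, by
the equation

\[ S\gamma = \gamma' + \cdots + \gamma^{(n)}, \]

where $\gamma', \dots, \gamma^{(n)}$ are the field conjugates of $\gamma$. By the Symmetric
Function Theorem, $S\gamma$ is in $Z$ if $\gamma$ is an integer, and it is clear that
$S(r\gamma) = rS\gamma$ if $r$ is rational. In particular,

\[ S\left(\frac{(\alpha\beta)^p}{p}\right) = \frac{S\left((\alpha\beta)^p\right)}{p} \]

is in $Z$, so that $S\left((\alpha\beta)^p\right)$ is in $[p]$. By the multinomial theorem, if
$\beta', \dots, \beta^{(n)}$ are the field conjugates of $\beta$, then

\[
\begin{aligned}
(S(\alpha\beta))^p &= \left(\alpha'\beta' + \cdots + \alpha^{(n)}\beta^{(n)}\right)^p \\
&\equiv (\alpha'\beta')^p + \cdots + \left(\alpha^{(n)}\beta^{(n)}\right)^p = S\left((\alpha\beta)^p\right) \\
&\equiv 0 \pmod{p},
\end{aligned}
\]

and since $S(\alpha\beta)$ is a rational integer, $p \mid S(\alpha\beta)$.

Now let $\rho_1, \dots, \rho_n$ be an integral basis for $R[\vartheta]$. Then

\[ \alpha = h_1\rho_1 + \cdots + h_n\rho_n, \]

where the $h$'s are rational integers not all divisible by $p$, since $p \nmid \alpha$.
For $1 \le i \le n$ we have

\[ S(\alpha\rho_i) = S\left(\sum_{j=1}^{n} h_j\rho_j\rho_i\right) = \sum_{j=1}^{n} h_j S(\rho_i\rho_j). \]

Let

\[ d = \det |S(\rho_i\rho_j)|, \]

and let $A_{ij}$ be the cofactor of $S(\rho_i\rho_j)$ in $d$. Then for $k = 1, 2, \dots, n$,

\[ \sum_{i=1}^{n} A_{ik} \sum_{j=1}^{n} h_j S(\rho_i\rho_j) = \sum_{j=1}^{n} h_j \sum_{i=1}^{n} A_{ik} S(\rho_i\rho_j) = dh_k. \]

Since

\[ p \mid \sum_{j} h_j S(\rho_i\rho_j) \]

for each $i$, it is also true that $p \mid dh_k$ for each $k$; $p$ therefore divides $d$.
Finally,

\[ d = \det |S(\rho_i\rho_j)| = \det \left|\sum_{k} \rho_i^{(k)}\rho_j^{(k)}\right| = \left(\det |\rho_i^{(k)}|\right)^2 = \Delta; \]

hence $p \mid \Delta$.

As an illustration of the present theorem, note that in the field
$R(i)$, of discriminant $-4$, we have

\[
\begin{aligned}
[2] &= [1 + i]^2, \\
[p] &= [a + bi][a - bi], && \text{if } a^2 + b^2 = p \equiv 1 \pmod{4}, \\
[q] &= \text{a prime ideal}, && \text{if } q \equiv 3 \pmod{4}.
\end{aligned}
\]

Here $[1 + i]$, $[a + bi]$, and $[a - bi]$ are prime ideals of degree 1, while
each $[q]$ is of degree 2.*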
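
The three cases of the illustration can be tabulated mechanically. The following Python sketch is not from the text; the function name and the labels "ramified", "split", and "inert" are conventions assumed for the example. For a rational prime $p$ it reports which case occurs in $R(i)$ and, where there is one, a generator $a + bi$ of a prime factor of $[p]$, found by a direct search for $a^2 + b^2 = p$. The caller is assumed to pass a prime; primality is not checked.

```python
from math import isqrt


def factor_in_gaussian_integers(p):
    """Describe how [p] factors in the Gaussian integers, for a rational prime p.

    Returns (kind, generator), where generator = (a, b) stands for a + bi,
    or None when [p] itself is a prime ideal.
    """
    if p == 2:
        return "ramified", (1, 1)          # [2] = [1 + i]^2
    if p % 4 == 3:
        return "inert", None               # [p] is a prime ideal of degree 2
    # p = 1 (mod 4): search for a^2 + b^2 = p, so that [p] = [a + bi][a - bi].
    a = 1
    while a * a < p:
        b = isqrt(p - a * a)
        if a * a + b * b == p:
            return "split", (a, b)
        a += 1
    raise ValueError("expected a rational prime")


table = {p: factor_in_gaussian_integers(p) for p in (2, 3, 5, 7, 13, 29)}
```

Only the prime 2, the single prime dividing the discriminant $-4$, produces a repeated factor, in agreement with Theorem 2-38.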

Theorem 2-39. Each ideal $[p]$, where $p$ is a rational prime, splits
into at most $n$ ideal factors in the integral domain of a field of degree $n$.

Proof: If

\[ [p] = P_1 \cdots P_r, \]

then

\[ p^n = N[p] = NP_1 \cdots NP_r, \]

and for each $i$, $NP_i > 1$. Hence $r \le n$.


PROBLEMS 

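1. In the domain $R[\sqrt{-5}]$, put

\[ A = [3,\ 4 + \sqrt{-5}], \quad B = [3,\ 4 - \sqrt{-5}], \quad C = [7,\ 4 + \sqrt{-5}], \quad D = [7,\ 4 - \sqrt{-5}]. \]

Show that $AB = [3]$, $CD = [7]$, $AC = [4 + \sqrt{-5}]$, $BD = [4 - \sqrt{-5}]$,
and that $A$, $B$, $C$, and $D$ are prime ideals. Factor $[1 + 2\sqrt{-5}]$.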

2. Let $R(\sqrt{d})$, where $d$ is square-free, have discriminant $\Delta$. If $q$ is an
odd prime dividing $\Delta$, show that the ideals



are equal, and that their product is $[q]$. Show also that if $\Delta$ is even, then
$[2] = [2, \sqrt{d}]^2$ for $d \equiv 2 \pmod{4}$ and that $[2] = [2, 1 + \sqrt{d}]^2$ if $d \equiv 3
\pmod{4}$. This completes the proof of Dedekind's theorem, stated just
before Theorem 2-38, in the case of a quadratic field.

*Compare the remark following Theorem 7-7, Volume I.

2-9 Units of algebraic number fields. We saw in Section 2-4 that
the units of a quadratic field $R(\sqrt{d})$ are determined by the solutions
of the Pell equation with $N = \pm 1$ or $\pm 4$, and it is an easy consequence
of this relationship and standard properties of Pell's equation*
that the group of units of $R(\sqrt{d})$ has a basis, consisting of $-1$ and
the fundamental solution $\varepsilon$ of the appropriate Pell equation. That is,
every unit of $R(\sqrt{d})$ can be written in the form $(-1)^{\alpha}\varepsilon^{\beta}$, where $\alpha$ is
0 or 1 and $\beta$ ranges over $Z$. We shall now show that this property is
not peculiar to quadratic fields, but that in fact the group of units in
each algebraic number field has a finite basis. (In general, if $G$ is a
commutative multiplicative group and $b_1, \dots, b_m$ are elements of $G$,
they are said to form a basis for $G$ if every element of $G$ can be represented
in the form $b_1^{a_1} \cdots b_m^{a_m}$, and in every such representation of
the unit element $e$ of $G$, the factor $b_i^{a_i} = e$ for $1 \le i \le m$.) This
theorem, which is due to Dirichlet, can be sharpened by giving
the exact number of basis elements, but for many purposes, including
the application to be made in the next chapter, the finiteness of the
basis suffices. The upper bound which we shall obtain is actually the
correct number.

We introduce the symbol $\overline{|\alpha|}$ to designate the maximum of the
absolute values of the conjugates of the algebraic number $\alpha$, and
denote by $K$ a fixed algebraic number field.

Theorem 2-40. If $a$ is a fixed positive number, there are only
finitely many integers $\alpha$ of $K$ such that

\[ \overline{|\alpha|} < a. \]

If all conjugates of $\alpha$ have absolute value 1, then $\alpha$ is a root of unity.

Proof: If $\overline{|\alpha|} < a$ and $\deg \alpha = n$, then each of the elementary
symmetric functions in $\alpha$ and its conjugates is numerically smaller
than some bound depending only on $a$ and $n$. If $\alpha$ is an integer in $K$,
then $n$ cannot exceed the degree of $K$, so that there are available only
finitely many coefficients for the defining polynomial of $\alpha$, and there
are, therefore, only finitely many such $\alpha$'s in $K$.

If $|\alpha^{(i)}| = 1$ for $i = 1, \dots, n$, then $\overline{|\alpha^m|} = 1$ for all $m$ in $Z$, so that
by what we have just proved, $\alpha^{m_1} = \alpha^{m_2}$ for some distinct exponents
$m_1$ and $m_2$. Hence $\alpha^{m_1 - m_2} = 1$, so that $\alpha$ is a root of unity.

Theorem 2-41. The group $U$ of roots of unity in $K$ is a finite cyclic
group.

*See, for example, Theorems 8-5, 8-6, and 8-7, Volume I.

Proof: If $\zeta$ is a root of unity, then $\overline{|\zeta|} = 1$, and the finiteness of the
group follows from the preceding theorem. Let the various elements
$u_i$ of $U$ be primitive $w_i$-th roots of unity, for $i = 1, \dots, t$, and put

\[ w = \max(w_1, \dots, w_t). \]

For fixed $i$, the numbers $e^{2\pi i a/w_i}$ and $e^{2\pi i b/w}$ are in $U$, for every $a$ and
$b$ in $Z$. If $(w_i, w) = d$, choose $a$ and $b$ so that $aw + bw_i = d$; then
the product

\[ e^{2\pi i(a/w_i + b/w)} = e^{2\pi i d/w_i w} = e^{2\pi i/\operatorname{lcm}(w_i, w)} \]

is in $U$. It follows that the lcm of $w_i$ and $w$ does not exceed $w$, so that
$w_i \mid w$ for $i = 1, \dots, t$. Since the powers of

\[ \zeta_0 = e^{2\pi i/w} \]

include all $d$th roots of unity if $d \mid w$, it is clear that $\zeta_0$ generates $U$.

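Now let $\vartheta$, of degree $n$, be a primitive element of $K$, so that
$K = R(\vartheta)$, and arrange the conjugates of $\vartheta$ in such an order that
$\vartheta^{(1)}, \dots, \vartheta^{(r_1)}$ are real, while $\vartheta^{(r_1+1)}, \dots, \vartheta^{(n)}$ are not real. (Note
that it is not necessarily true that $\vartheta^{(1)} = \vartheta$.) Then $n - r_1$ is an even
number, say $2r_2$, and we can further order the nonreal conjugates so
that $\vartheta^{(r_1+j)}$ and $\vartheta^{(r_1+r_2+j)}$ are complex-conjugate, for $j = 1, \dots, r_2$.

If $\alpha$ is any number of $K$, the field conjugates of $\alpha$ are such that
$\alpha^{(1)}, \dots, \alpha^{(r_1)}$ are real, while $\alpha^{(r_1+j)}$ and $\alpha^{(r_1+r_2+j)}$ are complex-conjugate
for $j = 1, \dots, r_2$. Of course some of these latter numbers
may also be real, but in any case

\[ |\alpha^{(r_1+j)}| = |\alpha^{(r_1+r_2+j)}| \qquad \text{for } j = 1, \dots, r_2. \tag{16} \]

If $\varepsilon_1, \dots, \varepsilon_k$ are units of $K$, they are said to be independent if the
relation

\[ \varepsilon_1^{a_1} \cdots \varepsilon_k^{a_k} = 1, \qquad a_1, \dots, a_k \text{ in } Z, \tag{17} \]

holds only for $a_1 = \cdots = a_k = 0$.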

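Theorem 2-42. Units $\varepsilon_1, \dots, \varepsilon_k$ in $K$ are independent if and only
if the sole solution of the system

\[ \sum_{m=1}^{k} x_m \log |\varepsilon_m^{(i)}| = 0, \qquad i = 1, 2, \dots, r, \tag{18} \]

in rational integers is $x_1 = \cdots = x_k = 0$.

Proof: Suppose that (17) has a solution in which not all the $a$'s
are zero. Then the analogous equation with each $\varepsilon_m$ replaced by $\varepsilon_m^{(i)}$
also holds, so that

\[ |\varepsilon_1^{(i)}|^{a_1} \cdots |\varepsilon_k^{(i)}|^{a_k} = 1, \]

\[ \sum_{m=1}^{k} a_m \log |\varepsilon_m^{(i)}| = 0, \qquad i = 1, \dots, n. \tag{19} \]

Conversely, if (19) holds with not all the rational integers $a_1, \dots, a_k$
equal to zero, then $\varepsilon_1^{a_1} \cdots \varepsilon_k^{a_k}$ is an integer of $K$ all of whose conjugates
have absolute value 1; it is therefore a root of unity whose $w$th
power is 1, and (17) holds with $a_1, \dots, a_k$ replaced by $wa_1, \dots, wa_k$.
Hence the nontrivial solvability in $Z$ of (19) is equivalent to the
dependence of $\varepsilon_1, \dots, \varepsilon_k$.

The truth of the theorem will now follow if we can prove that if
the equations (19) hold with $i \le r = r_1 + r_2 - 1$, then the
remaining $n - r_1 - r_2 + 1$ equations are also correct. To show this,
suppose that the first $r_1 + r_2 - 1$ equations are true, and define

\[ e_i = \begin{cases} 1 & \text{for } 1 \le i \le r_1, \\ 2 & \text{for } r_1 + 1 \le i \le r_1 + r_2. \end{cases} \]

Since each $\varepsilon_m$ is a unit, its norm has absolute value 1;
by (16),

\[ \sum_{i=1}^{n} \log |\varepsilon_m^{(i)}| = \sum_{i=1}^{r_1+r_2} e_i \log |\varepsilon_m^{(i)}| = 0. \]

Hence

\[ \sum_{m=1}^{k} a_m \sum_{i=1}^{r_1+r_2} e_i \log |\varepsilon_m^{(i)}| = \sum_{i=1}^{r_1+r_2} e_i \sum_{m=1}^{k} a_m \log |\varepsilon_m^{(i)}| = 0, \]

so that

\[ e_{r_1+r_2} \sum_{m=1}^{k} a_m \log |\varepsilon_m^{(r_1+r_2)}| = -\sum_{i=1}^{r_1+r_2-1} e_i \sum_{m=1}^{k} a_m \log |\varepsilon_m^{(i)}| = 0. \]

Thus (19) also holds for $i = r_1 + r_2$, and so, by (16), for
$i > r_1 + r_2$.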

Theorem 2-43. If the relation (18) holds for some set of real numbers
$x_1, \dots, x_k$ which are not all zero, it also holds for rational integers
$x_1, \dots, x_k$ which are not all zero.

Proof: Suppose the hypothesis fulfilled. Since the system (18) is
certainly nontrivially solvable in rational integers if some $\varepsilon_m$ is in $U$,
it suffices to consider the case that all the units are of infinite order.
Then each unit separately is independent. Now suppose that the
units are such that the equations

\[ \sum_{m=1}^{q-1} a_m \log |\varepsilon_m^{(i)}| = 0, \qquad i = 1, \dots, n, \tag{20} \]

have the single real solution $a_1 = \cdots = a_{q-1} = 0$, while the system

\[ \sum_{m=1}^{q} a_m \log |\varepsilon_m^{(i)}| = 0, \qquad i = 1, \dots, n, \tag{21} \]

has a nontrivial real solution $\alpha_1, \dots, \alpha_q$. Then $2 \le q \le k$, $\alpha_q \ne 0$,
and the ratios $\alpha_1/\alpha_q, \dots, \alpha_{q-1}/\alpha_q$ are uniquely determined, since
otherwise the differences of the respective ratios would provide a
nontrivial solution of (20). If we can show that these ratios are
rational numbers, the theorem will result by taking a suitable common
integral multiple of the numbers

\[ x_m = \begin{cases} \alpha_m/\alpha_q & \text{for } 1 \le m \le q, \\ 0 & \text{for } q < m \le k. \end{cases} \]

If we put $\alpha_m/\alpha_q = -\beta_m$ for $m = 1, \dots, q - 1$, equations (21)
imply that

\[ \log |\varepsilon_q^{(i)}| = \sum_{m=1}^{q-1} \beta_m \log |\varepsilon_m^{(i)}|, \qquad i = 1, \dots, n. \tag{22} \]

Now consider the set of all units $\eta$ with the property that

\[ \log |\eta^{(i)}| = \sum_{m=1}^{q-1} \gamma_m \log |\varepsilon_m^{(i)}|, \qquad i = 1, \dots, n, \tag{23} \]

for suitable real numbers $\gamma_1, \dots, \gamma_{q-1}$. For such an $\eta$, the coefficients
are unique. We call the set $\gamma_1, \dots, \gamma_{q-1}$ of real numbers proper
if $\eta$, as defined in (23) with these $\gamma$'s, actually is a unit, and if in
addition $|\gamma_1| \le 1, \dots, |\gamma_{q-1}| \le 1$. If $\gamma_1, \dots, \gamma_{q-1}$ is a proper set,
then

\[ \bigl|\log |\eta^{(i)}|\bigr| \le \sum_{m=1}^{q-1} \bigl|\log |\varepsilon_m^{(i)}|\bigr|, \]

and, by Theorem 2-40, there are only finitely many (say $H$) proper
sets. On the other hand, if $\gamma_1, \dots, \gamma_{q-1}$ is proper, so also is

\[ N\gamma_1 - [N\gamma_1], \dots, N\gamma_{q-1} - [N\gamma_{q-1}], \]

if $N$ is a rational integer. For

\[ \sum_{m=1}^{q-1} \bigl(N\gamma_m - [N\gamma_m]\bigr) \log |\varepsilon_m^{(i)}| = N \log |\eta^{(i)}| - \sum_{m=1}^{q-1} [N\gamma_m] \log |\varepsilon_m^{(i)}|, \]

which is the logarithm of a product of powers of units, and is therefore
the logarithm of a unit. Now if any $\beta_m$ were irrational, then no
two of the numbers $N\beta_m - [N\beta_m]$, where $N$ runs over $Z$, would be
equal, and we should have infinitely many proper sets. This contradiction
establishes the theorem.

Theorem 2-44. If $\varepsilon_1, \dots, \varepsilon_k$ are units such that the only real
solution of (18) is the trivial solution, then there is a rational integer
$M$ with the following property: in order that a number $\eta$ such that

\[ \log |\eta^{(i)}| = \sum_{m=1}^{k} \gamma_m \log |\varepsilon_m^{(i)}|, \qquad i = 1, \dots, n, \]

be a unit of $K$, it is necessary that all the numbers $M\gamma_m$ be rational
integers.

Proof: The hypothesis is that which was used in the preceding
proof, except that we have replaced $q - 1$ by $k$. Suppose that
$\gamma_m = a/b$, where $a$ and $b$ are rational integers with $b > 0$ and $(a, b) = 1$
and $m$ is one of the integers $1, \dots, k$. Then $N\gamma_m - [N\gamma_m]$ assumes
the $b$ values $0/b, 1/b, \dots, (b - 1)/b$, so that $b \le H$, where $H$ is the
number of proper sets. Hence $b \mid H!$, and we can take $M = H!$.

Theorem 2-45. The group $E$ of all units of $K$ has a finite basis, the
number of basis elements of infinite order being at most $r$.

Proof: The system (18) of $r$ linear homogeneous equations in $k$
unknowns is certainly nontrivially solvable in reals if $k > r$, and it
follows from Theorem 2-43 that there are at most $r$ independent
units in $K$. Let $k$ be the exact maximal number of independent units,
and let $\epsilon_1, \ldots, \epsilon_k$ be such a set. Then by Theorem 2-44, for every
unit $\eta$ of $K$ there are $g_1, \ldots, g_k$ in $Z$ such that

$$M \log|\eta^{(i)}| = \sum_{m=1}^{k} g_m \log|\epsilon_m^{(i)}|, \qquad i = 1, \ldots, n.$$

By the second part of Theorem 2-40, and Theorem 2-41, it follows
that

$$\eta^M = \epsilon_1^{g_1} \cdots \epsilon_k^{g_k} \epsilon_0^{g_0},$$

where $\epsilon_0$ is a root of unity, so that $\epsilon_1, \ldots, \epsilon_k, \epsilon_0$ form a basis
for the group of $M$th powers of units.
Now define the numbers $\iota_0, \ldots, \iota_k$ by the equations

$$\iota_m = \epsilon_m^{1/M}, \qquad m = 0, 1, \ldots, k,$$

where an arbitrary but fixed $M$th root is taken in each case. The
numbers $\iota_m$ may not lie in $K$, but they form a basis for a group $E_M$ of
complex numbers, and $E_M$ clearly contains $E$ as a subgroup. The
theorem is therefore a consequence of the following general principle.

Theorem 2-46. If $G$ is a commutative group having a basis of $n$
elements, every subgroup of $G$ also has a basis, of at most $n$ elements.

Proof: Suppose that $\lambda_1, \ldots, \lambda_n$ is a basis for $G$, that $S$ is a subgroup
of $G$, and that some $\lambda_i$ actually occurs in the representation of some
element $s$ of $S$. Let $I_i$ be the set of all exponents which occur on $\lambda_i$
in the representations of the various elements of $S$. If $a$ is in $I_i$, so
is $ka$ for $k$ in $Z$, and if $a$ and $a'$ are in $I_i$, so is $a - a'$. Hence $I_i$ is an
ideal in $Z$, and is therefore a principal ideal, say $I_i = [a_i]$.

We now proceed by induction on $n$. If $n = 1$, then $\lambda_1^{a_1}$ is a basis
for $S$, by what we have just proved. Suppose that the theorem is
true for every commutative group with $n - 1$ basis elements, and
suppose that $G$ has $n$ basis elements, say $\lambda_1, \ldots, \lambda_n$. Let $S$ be a
subgroup of $G$. If every element of $S$ can be written in the form

$$\lambda_1^{a_1} \cdots \lambda_{n-1}^{a_{n-1}},$$

the result follows from the induction hypothesis. Otherwise, suppose
that $I_n = [a]$, and let $\lambda$ be an element of $S$ in whose basis
representation $\lambda_n$ occurs with exponent $a$. Then for every $s$ in $S$ there exists a
$b_s$ in $Z$ such that $s\lambda^{b_s}$ has a representation

$$s\lambda^{b_s} = \lambda_1^{a_1} \cdots \lambda_{n-1}^{a_{n-1}}.$$

The set of numbers of the form $s\lambda^{b_s}$ is therefore a subgroup of the
group $G'$ which has $\lambda_1, \ldots, \lambda_{n-1}$ as a basis, and by the induction
hypothesis this subgroup also has a basis, of at most $n - 1$ elements.
This latter basis, together with $\lambda$, clearly constitutes a basis for $S$.
Section 2-4

The complete tabulation of Euclidean domains is the work of many
writers. K. Inkeri (Annales Academiae Scientiarum Fennicae, Series A
(Helsinki) I, Mathematics-Physics, 41, 35 pp. (1947)) supplied the last link
in a chain of theorems which together show that if $d > 100$, then $R(\sqrt{d})$
is not Euclidean. E. S. Barnes and H. P. F. Swinnerton-Dyer (Acta
Mathematica (Stockholm) 87, 259-323 (1952)) showed that, contrary to what
had been believed, $R(\sqrt{97})$ is not Euclidean. P. Varnavides (Proceedings
Konink. Nederlandsche Akademie van Wetenschappen, Series A (Amsterdam)
55, 111-122 (1952) or Indagationes Mathematicae (Amsterdam) 14,
111-122 (1952)) showed that the values of $d$ listed in the text yield
Euclidean domains.


Section 2-9 

The material of this section is adapted from E. Hecke, Vorlesungen über
die Theorie der algebraischen Zahlen, Leipzig: Akademische
Verlagsgesellschaft m.b.H., 1923; reprinted by Chelsea Publishing Company,
New York, 1948; pp. 116-131. It is proved there that the upper bound 
obtained in the text is exact. 


CHAPTER 3 


APPLICATIONS TO RATIONAL NUMBER THEORY 


3-1 Introduction. As was suggested in the preceding chapter, 
there are many problems in rational number theory which are most 
naturally treated in the more extensive framework of an algebraic 
number field. Chief among these are various Diophantine equations; 
indeed, it was the study of Fermat's equation, $x^n + y^n = z^n$, $n > 2$,
which was originally responsible for the development of ideal theory. 
While this approach has not led to a complete verification of Fermat's 
conjecture in all cases, it has produced results which would probably 
never have been obtained using rational methods alone. In the first 
part of this chapter we will discuss some results of this kind due to 
E. Kummer. Here heavy use will be made of ideal theory. 

The latter portion of the chapter is primarily concerned with a
theorem due to B. Delaunay and T. Nagell, which asserts that the
cubic analog of Pell's equation,

$$x^3 + dy^3 = 1,$$

has at most one solution in nonzero rational integers $x$, $y$, and
completely characterizes this possible solution. (In the next chapter we
shall prove a less precise result about the general equation $x^n + dy^n = 1$,
$n \ge 3$.) Use is made here of the insolvability in $Z$ of

$$x^3 + y^3 = z^3,$$

but otherwise the two parts are mutually independent.

3-2 Equivalence and class number. We say that the ideals $A$
and $B$ of $R[\vartheta]$ are equivalent, and write $A \sim B$, if there are nonzero
elements $\alpha$ and $\beta$ of $R[\vartheta]$ such that

$$[\alpha]A = [\beta]B.$$

It is easily seen that $\sim$ is an equivalence relation. Moreover, if
$A \sim B$ and $C \sim D$, then $AC \sim BD$, and if $AC \sim BC$ then $A \sim B$.

Theorem 3-1. All principal ideals are equivalent. Any ideal 
equivalent to a principal ideal is principal. 

Proof: The first statement is trivial, since

$$[\alpha][\beta] = [\beta][\alpha].$$

If $A \sim [\alpha]$, then for some $\beta$ and $\gamma$,

$$[\beta]A = [\alpha][\gamma] = [\alpha\gamma],$$

and hence

$$[\beta] \mid [\alpha\gamma], \qquad \alpha\gamma = \beta\delta, \qquad [\beta]A = [\alpha\gamma] = [\beta][\delta], \qquad A = [\delta].$$

Since equivalence is an equivalence relation, the ideals of $R[\vartheta]$ can
be separated into equivalence classes in the usual way. The number
$h$ of such classes is called the class number of the field; according to
Theorem 3-1, $h = 1$ if and only if every ideal is principal, i.e., if and
only if $R[\vartheta]$ is a unique factorization domain. We shall now show that
$h$ is always finite.

Theorem 3-2. There is a positive constant $c$, which depends only
on the field, such that each ideal $A$ divides a principal ideal $AB$ for
which

$$N(AB) \le c\,NA.$$

Proof: Let $\rho_1, \ldots, \rho_n$ be a field basis, and let $\rho_i^{(s)}$
($s = 1, \ldots, n$) be the field conjugates of these numbers. We shall
show that the theorem is true with

$$c = \prod_{s=1}^{n} \bigl(|\rho_1^{(s)}| + \cdots + |\rho_n^{(s)}|\bigr).$$

Let $A$ be an arbitrary ideal, and let $k$ be the greatest rational
integer not exceeding $(NA)^{1/n}$, so that $k^n \le NA < (k+1)^n$. Then
as $u_1, \ldots, u_n$ range independently over the integers $0, 1, \ldots, k$ there
are determined $(k+1)^n$ different integers

$$u_1\rho_1 + \cdots + u_n\rho_n,$$

and two of them must be congruent modulo $A$:

$$u_1\rho_1 + \cdots + u_n\rho_n \equiv v_1\rho_1 + \cdots + v_n\rho_n \pmod{A}.$$

Thus

$$\alpha = (u_1 - v_1)\rho_1 + \cdots + (u_n - v_n)\rho_n$$

is in $A$, so that $A \mid [\alpha]$, and

$$N[\alpha] = |N\alpha| = \prod_{s=1}^{n} \Bigl|\sum_{i=1}^{n} (u_i - v_i)\rho_i^{(s)}\Bigr| \le \prod_{s=1}^{n} \sum_{i=1}^{n} |u_i - v_i| \cdot |\rho_i^{(s)}| \le \prod_{s=1}^{n} \sum_{i=1}^{n} k|\rho_i^{(s)}| = ck^n \le c\,NA.$$

Theorem 3-3. The class number of any algebraic number field is 
finite. 

Proof: It suffices to show that in each class there is an ideal $B$ such
that $NB \le c$, by the corollary to Theorem 2-35. Let $C$ be an
arbitrary ideal of a given class, and determine $A$ so that $AC$ is principal.
Then by Theorem 3-2, there is an ideal $B$ such that $AB$ is principal
and $N(AB) \le c\,NA$. Then $AB \sim AC$, $B \sim C$, and

$$NB = \frac{N(AB)}{NA} \le c.$$


Theorem 3-4. If $h$ is the class number, the $h$th power of any ideal is
principal.

Proof: If $A_1, \ldots, A_h$ is a complete system of representatives of the
various classes, and $A$ is arbitrary, then $AA_1, \ldots, AA_h$ is another
such system. Hence

$$A_1 \cdots A_h \sim AA_1 \cdots AA_h = A^h A_1 \cdots A_h,$$

so $A^h \sim [1]$ and $A^h$ is principal.

Theorem 3-5. If $p$ is a rational prime and $p \nmid h$, then $A^p \sim B^p$
implies $A \sim B$.

Proof: Since $p \nmid h$, there are positive $x$ and $y$ in $Z$ such that

$$px - hy = 1.$$

From the fact that $A^p \sim B^p$ we have

$$[\alpha]A^p = [\beta]B^p,$$

$$[\alpha]^x A^{hy} A = [\beta]^x B^{hy} B,$$

and by Theorem 3-4, $A \sim B$.
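The positive integers $x$ and $y$ with $px - hy = 1$ used in this proof can be found explicitly from a modular inverse. The following is a minimal sketch, not part of the original text; the function name `bezout_positive` is hypothetical, and the sample values of `p` and `h` are arbitrary coprime integers chosen only for illustration, not class numbers of any particular field.

```python
def bezout_positive(p, h):
    """Return positive integers (x, y) with p*x - h*y == 1.

    Assumes p >= 2, h >= 1 and gcd(p, h) == 1; pow() raises ValueError
    when p has no inverse modulo h.
    """
    # x is the inverse of p modulo h, taken in the range 1..h-1 (or 1 if h == 1).
    x = pow(p, -1, h) if h > 1 else 1
    # p*x - 1 is a positive multiple of h, so y is a positive integer.
    y = (p * x - 1) // h
    return x, y


if __name__ == "__main__":
    for p, h in [(5, 3), (7, 10), (37, 4), (11, 1)]:
        x, y = bezout_positive(p, h)
        print(p, h, x, y, p * x - h * y)
```

With such a pair in hand, $A^{px} = A^{hy} A$, which is exactly the exponent bookkeeping the proof relies on.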

Theorem 3-5 shows that the primes which do not divide h enjoy a 
property not shared by other primes. This is of great importance in 
the investigation of Fermat’s equation. 

PROBLEMS

1. Let $\alpha$ and $\beta$ be algebraic integers, not both zero. Show that there is an
integer $\delta$ such that, first, $\delta \mid \alpha$ and $\delta \mid \beta$ (in the sense that $\alpha/\delta$ and $\beta/\delta$ are again
integers), and, second, for suitable integers $\xi$ and $\eta$, $\delta = \alpha\xi + \beta\eta$. Show that
this GCD is unique up to an algebraic unit (i.e., an integer which divides 1).
[Hint: First settle the case $\alpha\beta = 0$. In the other case, let $K$ be an
algebraic number field of class number $h$, containing both $\alpha$ and $\beta$. Then
$[\alpha, \beta]^h = [\gamma]$, for some $\gamma$ in $K$. Let $\delta$ be an integer such that $\delta^h = \gamma$, and
show that the equation $[\alpha, \beta]^h = [\gamma]$ still holds when $[\alpha, \beta]$ and $[\gamma]$ are
interpreted as ideals in $K(\delta)$. Deduce that $[\alpha, \beta] = [\delta]$ in $K(\delta)$.] Does the
Unique Factorization Theorem hold in the domain of all algebraic integers?

2. Let $K$ be an algebraic number field. Show that to each ideal $A$ of $K$
there corresponds an integer $\alpha$ (not necessarily in $K$) such that the elements
of $A$ are exactly those integers of $K$ which are divisible by $\alpha$.

3-3 The cyclotomic field $K_p$. Let $p$ be an odd prime, let

$$\Phi(x) = \frac{x^p - 1}{x - 1} = x^{p-1} + x^{p-2} + \cdots + 1,$$

and let $\zeta = e^{2\pi i/p}$, so that the zeros of $\Phi$ are $\zeta, \zeta^2, \ldots, \zeta^{p-1}$, the
primitive $p$th roots of unity. The field $R(\zeta) = R(\zeta^2) = \cdots = R(\zeta^{p-1}) = K_p$
is called a cyclotomic field. It is clearly of degree
$p - 1$ at most. We put $1 - \zeta = \pi$. (The fact that the symbol $\pi$ is
used for two different numbers should occasion no confusion; the
number $\pi = 3.14159\ldots$ will occur only in the argument of the
exponential function.)

Theorem 3-6. In $K_p$, the ideal $[p]$ has the factorization

$$[p] = [\pi]^{p-1};$$

$[\pi]$ is prime, and $N[\pi] = p$; $\Phi$ is irreducible, and $K_p$ is of degree
$p - 1$.

Proof: Since $\zeta$ is an integer of $K_p$, so is

$$\epsilon_r = \frac{1 - \zeta^r}{1 - \zeta} = 1 + \zeta + \cdots + \zeta^{r-1}, \qquad 1 \le r \le p - 1;$$

if now an $r'$ is chosen so that $rr' \equiv 1 \pmod p$, then $\zeta^{rr'} = \zeta$, so that

$$\epsilon_r^{-1} = \frac{1 - \zeta}{1 - \zeta^r} = \frac{1 - \zeta^{rr'}}{1 - \zeta^r} = 1 + \zeta^r + \cdots + \zeta^{r(r'-1)}$$

is also an integer. Hence $\epsilon_r$ is a unit of $K_p$, and

$$p = \Phi(1) = \prod_{r=1}^{p-1} (1 - \zeta^r) = (1 - \zeta)^{p-1} \prod_{r=1}^{p-1} \epsilon_r = \epsilon (1 - \zeta)^{p-1},$$

where $\epsilon$ is a unit. It follows from this equation that $[p] = [\pi]^{p-1}$,
and also that $N\pi = p$. By Theorem 2-39, $\deg K_p \ge p - 1$, so that
$[\pi]$ is prime, $\deg K_p = p - 1$, and $\Phi$ is irreducible. (For a different
proof of the irreducibility of $\Phi$, see Problem 1, Section 2-2.)

Hereafter, we designate $[\pi]$ by $P$.
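The identity $p = \Phi(1) = \prod_{r=1}^{p-1}(1 - \zeta^r)$ at the heart of this proof is easy to check numerically. The sketch below is an illustration only (the function name `cyclotomic_product` is hypothetical); it works in floating-point complex arithmetic, so the product agrees with $p$ only up to rounding error.

```python
import cmath


def cyclotomic_product(p):
    """Return the product of (1 - zeta**r) for r = 1..p-1, zeta = exp(2*pi*i/p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    product = 1 + 0j
    for r in range(1, p):
        product *= 1 - zeta ** r
    return product


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        value = cyclotomic_product(p)
        # The real part should be close to p and the imaginary part close to 0.
        print(p, round(value.real, 9), round(value.imag, 9))
```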

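Theorem 3-7. Writing $\Delta(1, \zeta, \ldots, \zeta^{p-2}) = \Delta(\zeta)$, we have

$$\Delta(\zeta) = (-1)^{\frac12(p-1)} p^{p-2}.$$

Proof: From the representations

$$\Phi(x) = \prod_{r=1}^{p-1} (x - \zeta^r) = \frac{x^p - 1}{x - 1}$$

we obtain

$$\Phi'(\zeta^s) = \prod_{\substack{1 \le r \le p-1 \\ r \ne s}} (\zeta^s - \zeta^r) = \frac{p\,\zeta^{s(p-1)}}{\zeta^s - 1} = \frac{-p}{\zeta^s (1 - \zeta^s)}.$$

Since

$$\Delta(\zeta) = \begin{vmatrix} 1 & \zeta & \cdots & \zeta^{p-2} \\ 1 & \zeta^2 & \cdots & \zeta^{2(p-2)} \\ \vdots & \vdots & & \vdots \\ 1 & \zeta^{p-1} & \cdots & \zeta^{(p-2)(p-1)} \end{vmatrix}^2 = \prod_{1 \le r < s \le p-1} (\zeta^s - \zeta^r)^2,$$

we have

$$\Delta(\zeta) = (-1)^{\frac12(p-1)(p-2)} \prod_{s=1}^{p-1} \Phi'(\zeta^s) = (-1)^{\frac12(p-1)(p-2)} \prod_{s=1}^{p-1} \frac{-p}{\zeta^s (1 - \zeta^s)} = (-1)^{\frac12(p-1)} p^{p-2}.$$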
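As a sanity check on Theorem 3-7, the discriminant can be computed directly as the product of squared differences of the conjugates $\zeta, \zeta^2, \ldots, \zeta^{p-1}$ and compared with $(-1)^{(p-1)/2} p^{p-2}$. This is a numerical sketch in floating point, not part of the original argument; the function names are hypothetical.

```python
import cmath
from itertools import combinations


def discriminant_of_zeta(p):
    """Product of (zeta**s - zeta**r)**2 over 1 <= r < s <= p-1, computed numerically."""
    zeta = cmath.exp(2j * cmath.pi / p)
    conjugates = [zeta ** r for r in range(1, p)]
    disc = 1 + 0j
    for a, b in combinations(conjugates, 2):
        disc *= (b - a) ** 2
    return disc


def predicted_discriminant(p):
    """Value asserted by Theorem 3-7: (-1)**((p-1)/2) * p**(p-2)."""
    return (-1) ** ((p - 1) // 2) * p ** (p - 2)


if __name__ == "__main__":
    for p in (3, 5, 7, 11):
        print(p, discriminant_of_zeta(p), predicted_discriminant(p))
```

For $p = 3$ the product is $(\zeta^2 - \zeta)^2 = -3$ and for $p = 5$ it is $125$, in agreement with the formula.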


Theorem 3-8. The numbers $1, \zeta, \ldots, \zeta^{p-2}$ form an integral basis
for $K_p$, so that

$$\Delta = \Delta(\zeta) = (-1)^{\frac12(p-1)} p^{p-2}.$$

Proof: Suppose that $\alpha$ is an integer of $K_p$, and that

$$\alpha = r_0 + r_1\zeta + \cdots + r_{p-2}\zeta^{p-2},$$

where the $r$'s are rational. Then for $k = 0, \ldots, p - 2$,

$$\zeta^k \alpha = \sum_{j=0}^{p-2} r_j \zeta^{j+k},$$

and since the trace function is clearly additive,

$$S(\zeta^k \alpha) = \sum_{j=0}^{p-2} S(r_j \zeta^{j+k}) = \sum_{j=0}^{p-2} r_j S(\zeta^{j+k}).$$

Solving this system of equations for the numbers $r_j$, we obtain

$$r_j = \frac{\text{a determinant in } \alpha \text{ and } \zeta}{\det |S(\zeta^{j+k})|}.$$

But as we saw in the proof of Theorem 2-38, $\det |S(\zeta^{j+k})| = \Delta(\zeta)$;
since the determinant in the numerator has the rational value $r_j\Delta(\zeta)$,
and is clearly an integer of $K_p$, it is a rational integer. Thus $\alpha$ can be
written in the form

$$\alpha = \frac{c_0 + c_1\zeta + \cdots + c_{p-2}\zeta^{p-2}}{p^{p-2}} = \frac{d_0 + d_1\pi + \cdots + d_{p-2}\pi^{p-2}}{p^{p-2}},$$

where the $c$'s, and therefore also the $d$'s, are in $Z$. Since $\alpha$ is an
integer,

$$p \mid (d_0 + d_1\pi + \cdots + d_{p-2}\pi^{p-2}),$$

and since $P \mid [p]$,

$$P \mid [d_0 + d_1\pi + \cdots + d_{p-2}\pi^{p-2}],$$

so $P \mid [d_0]$. It follows that $NP \mid N[d_0]$, and finally $p \mid d_0$. This
argument may be repeated $p - 2$ times, to show that $p \mid d_k$ for
$k = 1, \ldots, p - 2$, so that

$$\alpha = \frac{e_0 + e_1\pi + \cdots + e_{p-2}\pi^{p-2}}{p^{p-3}},$$

where the $e$'s are rational integers. Repeating the entire argument
$p - 3$ times, we see that

$$\alpha = f_0 + f_1\pi + \cdots + f_{p-2}\pi^{p-2},$$

where the $f$'s are in $Z$. Hence $1, \pi, \ldots, \pi^{p-2}$ form an integral basis
for $K_p$. But from the equations

$$\pi = 1 - \zeta, \qquad \zeta = 1 - \pi,$$
$$\pi^2 = 1 - 2\zeta + \zeta^2, \qquad \zeta^2 = 1 - 2\pi + \pi^2,$$
$$\cdots$$

we see that $\Delta(\pi) = a^2\Delta(\zeta)$ and $\Delta(\zeta) = a^2\Delta(\pi)$, where $a$ is a certain
determinant with binomial coefficients as entries. Hence $a^2 = 1$,
$\Delta(\zeta) = \Delta(\pi)$, and $1, \zeta, \ldots, \zeta^{p-2}$ also form an integral basis, by
Theorem 2-14.

Theorem 3-9. If $\alpha$ is an integer of $K_p$, there is a rational integer $a$
such that

$$\alpha^p \equiv a \pmod{P^p}.$$

Proof: Since $NP = p$, the incongruent numbers $0, 1, \ldots, p - 1$
form a complete residue system modulo $P$, so that for suitable $b$ in $Z$,

$$\alpha \equiv b \pmod P.$$

But

$$\alpha^p - b^p = \prod_{r=0}^{p-1} (\alpha - \zeta^r b),$$

and since $\zeta \equiv 1 \pmod P$, each factor is congruent to $\alpha - b$ modulo $P$, so that

$$\alpha^p - b^p \equiv 0 \pmod{P^p},$$

and we can take $a = b^p$.

If $P \nmid [\alpha]$ and $\alpha \equiv a \pmod{P^2}$ for some $a$ in $Z$, then $\alpha$ is said to be
primary.
Theorem 3-10. If $\pi \nmid \alpha$, then for some positive rational integer $f$,
$\zeta^f \alpha$ is primary.

Proof: For suitable $a$ and $b$ in $Z$,

$$\alpha \equiv a + b\pi \pmod{P^2},$$

and $\pi \nmid \alpha$, so that $p \nmid a$. Choose $f$ so that

$$af \equiv b \pmod p.$$

Then since

$$\zeta^f = (1 - \pi)^f \equiv 1 - f\pi \pmod{P^2},$$

we have

$$\zeta^f \alpha \equiv (1 - \pi f)(a + b\pi) \equiv a + \pi(-af + b) \equiv a \pmod{P^2}.$$

We now investigate the units of $K_p$.

Theorem 3-11. The only roots of unity in $K_p$ are the numbers $\pm\zeta^r$,
$0 \le r < p$.

Proof: The roots of unity are the numbers

$$e^{2\pi i t/m},$$

where $t$ and $m$ are rational integers, and $(t, m) = 1$. If such a number
is in $K_p$, and if $tt' \equiv 1 \pmod m$, then also

$$(e^{2\pi i t/m})^{t'} = e^{2\pi i/m}$$

is in $K_p$. The numbers mentioned in the theorem are the $(2p)$th
roots of unity, so we need only show that $e^{2\pi i/m}$ is not in $K_p$ if $m \nmid 2p$.

If $m \nmid 2p$, then either $4 \mid m$, or some odd prime $q \ne p$ divides $m$, or
$p^2 \mid m$. Suppose that $e^{2\pi i/m}$ is in $K_p$.

If $4 \mid m$, then

$$(e^{2\pi i/m})^{m/4} = i$$

is in $K_p$. But then so are $1 + i$ and $1 - i$, and

$$[1 + i] = [1 - i] \qquad \text{and} \qquad [2] = [1 + i]^2,$$

contrary to Theorem 2-38.

If $q \mid m$, then

$$(e^{2\pi i/m})^{m/q} = e^{2\pi i/q}$$

is in $K_p$. But then the reasoning used in the proof of Theorem 3-6
shows that

$$[q] = [1 - e^{2\pi i/q}]^{q-1},$$

again contradicting Theorem 2-38.

If $p^2 \mid m$, then

$$\xi = e^{2\pi i/p^2}$$

is in $K_p$. But $\xi$ is a zero of

$$\frac{x^{p^2} - 1}{x^p - 1} = x^{p(p-1)} + x^{p(p-2)} + \cdots + x^p + 1 = \prod_{\substack{1 \le m < p^2 \\ p \nmid m}} (x - \xi^m),$$

and

$$p = \prod_{\substack{1 \le m < p^2 \\ p \nmid m}} (1 - \xi^m).$$

As before, the factors in this product are associated, and we get

$$[p] = [1 - \xi]^{p(p-1)},$$

contradicting Theorem 2-39.

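Theorem 3-12. Each unit $\epsilon$ of $K_p$ can be written in the form

$$\epsilon = \zeta^g \eta,$$

where $g$ is a positive rational integer and $\eta$ is real.

Proof: Express $\epsilon$ in terms of the integral basis $1, \zeta, \ldots, \zeta^{p-2}$:

$$\epsilon = f(\zeta),$$

where $f$ is a polynomial with rational integral coefficients. Then
clearly $\epsilon_s = f(\zeta^s)$ is also a unit, since $N\epsilon = \epsilon_1 \cdots \epsilon_{p-1} = \pm 1$. Also,

$$\epsilon_{p-s} = f(\zeta^{p-s}) = f(\zeta^{-s}) = f(\overline{\zeta^s}) = \overline{\epsilon_s},$$

where the bar denotes the complex conjugate, so that $\epsilon_s \epsilon_{p-s} = |\epsilon_s|^2 > 0$,
and

$$N\epsilon = \prod_{s=1}^{\frac12(p-1)} \epsilon_s \epsilon_{p-s} > 0,$$

so that $N\epsilon = 1$. Since

$$\left| \frac{\epsilon_s}{\epsilon_{p-s}} \right| = 1,$$

and the polynomial

$$\prod_{s=1}^{p-1} \left( x - \frac{\epsilon_s}{\epsilon_{p-s}} \right)$$

has coefficients in $Z$, it follows by Theorem 2-40 that $\epsilon_1/\epsilon_{p-1}$ is a root of unity,
and by Theorem 3-11,

$$\frac{\epsilon_1}{\epsilon_{p-1}} = \pm\zeta^m.$$

Since either $m$ or $p + m$ is even, and since $\zeta^{p+m} = \zeta^m$,
we can write

$$\epsilon_1 = \pm\zeta^{2g} \epsilon_{p-1}.$$

The proof will be complete if it can be shown that the plus sign is
appropriate here, since then the quantities $\zeta^{-g}\epsilon_1$ and $\zeta^{g}\epsilon_{p-1}$ are
simultaneously equal and complex-conjugate, and are therefore real,
so that $\epsilon = \epsilon_1 = \zeta^g \eta$.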

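To show this, choose $a$ from among $0, 1, \ldots, p - 1$ so that

$$\epsilon \equiv a \pmod P.$$

Then

$$\frac{\epsilon - a}{\pi}$$

is an integer in $K_p$, as is its complex conjugate

$$\frac{\bar\epsilon - a}{\bar\pi}.$$

Since $\bar\pi = 1 - \bar\zeta$ is an associate of $\pi$, it follows that

$$\frac{\bar\epsilon - a}{\pi}$$

is an integer of $K_p$, so that

$$\bar\epsilon \equiv a \equiv \epsilon \pmod P.$$

Thus, since $\bar\epsilon = \epsilon_{p-1}$,

$$\frac{\epsilon}{\epsilon_{p-1}} \equiv 1 \pmod P.$$

If the minus sign obtains in the equation $\epsilon_1 = \pm\zeta^{2g}\epsilon_{p-1}$, we have

$$-\zeta^{2g} \equiv 1 \equiv \zeta^{2g} \pmod P,$$

$$2\zeta^{2g} \equiv 0 \pmod P,$$

$$P \mid [2\zeta^{2g}],$$

$$NP \mid 2^{p-1},$$

contrary to the fact that $NP = p$. The proof is complete.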


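PROBLEMS

1. Let $p$ and $q$ be distinct odd primes, and let $\zeta$ be a primitive $p$th root
of unity.

(a) Show that

$$p = \prod_{a=1}^{p-1} (1 - \zeta^a) = (-1)^{\frac12(p-1)} \prod_{a=1}^{\frac12(p-1)} (\zeta^a - \zeta^{-a})^2.$$

(b) Show that

$$(\zeta^a - \zeta^{-a})^q \equiv \zeta^{aq} - \zeta^{-aq} \pmod q.$$

(c) Deduce that

$$p^{\frac12(q-1)} \equiv (-1)^{\frac12(p-1) \cdot \frac12(q-1)} \prod_{a=1}^{\frac12(p-1)} \frac{\zeta^{aq} - \zeta^{-aq}}{\zeta^a - \zeta^{-a}} \pmod q.$$

(d) Show that the second factor on the right side of the last
congruence above is $(-1)^m$, where $m$ is the number of numerically smallest residues
(mod $p$) among $q, 2q, \ldots, \frac12(p-1)q$ which are negative, and so obtain a
proof of the law of quadratic reciprocity.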

2. For an odd prime $p$ and a positive integer $h$, put

$$\Phi_h(x) = \frac{x^{p^h} - 1}{x^{p^{h-1}} - 1} = x^{p^{h-1}(p-1)} + x^{p^{h-1}(p-2)} + \cdots + x^{p^{h-1}} + 1,$$

and let $\zeta$ be a zero of $\Phi_h$ and $K_{p^h} = R(\zeta)$. Then the degree of $K_{p^h}$ is at most
$\varphi(p^h) = t$. Put $1 - \zeta = \pi$, and $[\pi] = P$.

(a) Show that in $K_{p^h}$, the ideal $[p]$ has the factorization $P^t$, $P$ is prime,
$NP = p$, $\Phi_h(x)$ is irreducible and $K_{p^h}$ is of degree $t$.

(b) Show that $\Delta(\zeta) = (-1)^{\frac12 t(t-1)} p^{p^{h-1}(hp - h - 1)}$. [Hint: Notice that
$1 - \zeta^{p^{h-1}}$ is a zero of $\Phi_1(1 - x)$, an irreducible polynomial of degree
$p - 1$ with leading coefficient $(-1)^{p-1}$ and constant term $p$, and deduce
that $N(1 - \zeta^{p^{h-1}}) = ((-1)^{p-1} p)^{p^{h-1}}$.]

(c) Show that if $L$ is any prime ideal in $K_{p^h}$ different from $P$, and if
$\zeta^a \equiv \zeta^b \pmod L$, then $a \equiv b \pmod{p^h}$.
3-4 Fermat's equation. For the sake of completeness, we consider
first the equation

$$x^n + y^n = z^n \tag{1}$$

for the cases $n = 2$, 4, and 3. When these have been disposed of,
Fermat's assertion would be proved if it could be shown that (1)
has no solutions in rational integers $x$, $y$, $z$, with $xyz \ne 0$, if $n$ is a
prime larger than 3.

The proof that (1) is impossible when n = 4 depends on the 
following theorem, which characterizes the solutions of (1) when 
n = 2. 

Theorem 3-13. A general primitive solution (i.e., a solution in
which $(x, y, z) = 1$) of

$$x^2 + y^2 = z^2, \qquad y \text{ even}, \quad x > 0, \quad y > 0, \quad z > 0,$$

is given by

$$x = a^2 - b^2, \qquad y = 2ab, \qquad z = a^2 + b^2,$$

where $a$ and $b$ are prime to each other and not both odd, and $a > b > 0$.

Remark: It is clear that one of $x$ and $y$ must be even, since otherwise
$x^2 + y^2 \equiv 2 \not\equiv z^2 \pmod 4$. There is no loss in generality in
assuming that it is $y$ which is even.

Proof: Suppose that $x^2 + y^2 = z^2$. Since $(x, y, z) = 1$, also
$(y, z) = 1$, so that $(z - y, z + y) = 1$ or 2. But $z$ is odd and $y$ is
even, so that $(z - y, z + y) = 1$. Hence, from the equation

$$x^2 = (z - y)(z + y),$$

we deduce that $z - y$ and $z + y$ must be squares, since they are
positive. Now if $t$ and $u$ are fixed integers of the same parity (both
odd or both even), there are integers $a$ and $b$ such that $t = a + b$ and
$u = a - b$. Hence we can put

$$z - y = (a - b)^2, \qquad z + y = (a + b)^2,$$

which gives

$$z = \frac{(a - b)^2 + (a + b)^2}{2} = a^2 + b^2,$$

$$y = \frac{(a + b)^2 - (a - b)^2}{2} = 2ab,$$

$$x = (a - b)(a + b) = a^2 - b^2.$$

Since $(z + x, z - x) = (2a^2, 2b^2) = 2$, we must choose $a$ and $b$ so
that $(a, b) = 1$. Since $x$ is odd, $a + b$ must be odd. Since $y > 0$,
$a$ and $b$ must have the same sign, and since $x > 0$, $|a| > |b|$. Since
the pairs $a, b$ and $-a, -b$ give the same solution, we can suppose
that $a > b > 0$.
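The parametrization of Theorem 3-13 translates directly into a generator of primitive solutions. The following sketch (the function name `primitive_triples` and the `limit` parameter are hypothetical, chosen for illustration) enumerates coprime pairs $a > b > 0$ of opposite parity and yields $(a^2 - b^2, 2ab, a^2 + b^2)$ for every $z$ up to a given bound.

```python
from math import gcd


def primitive_triples(limit):
    """Yield primitive (x, y, z) with x^2 + y^2 = z^2, y even, z <= limit.

    Uses x = a^2 - b^2, y = 2ab, z = a^2 + b^2 with a > b > 0,
    gcd(a, b) = 1 and a, b not both odd.
    """
    a = 2
    while a * a + 1 <= limit:
        for b in range(1, a):
            # Opposite parity and coprimality are the conditions of the theorem.
            if (a - b) % 2 == 1 and gcd(a, b) == 1:
                z = a * a + b * b
                if z <= limit:
                    yield (a * a - b * b, 2 * a * b, z)
        a += 1


if __name__ == "__main__":
    for triple in sorted(primitive_triples(30), key=lambda t: t[2]):
        print(triple)
```

Each primitive solution with $y$ even appears exactly once, since the pair $(a, b)$ is recovered from $z + x = 2a^2$ and $z - x = 2b^2$.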

Theorem 3-14. The equation $x^4 + y^4 = z^2$ is not solvable in
nonzero rational integers.

Proof: It suffices to show that there is no primitive solution of
the equation

$$x^4 + y^4 = z^2.$$

Suppose that $x$, $y$, and $z$ constitute such a solution; with no loss in
generality we can take $x > 0$, $y > 0$, $z > 0$, and $y$ even. Writing
the supposed relation in the form

$$(x^2)^2 + (y^2)^2 = z^2,$$

we have from the preceding theorem that

$$x^2 = a^2 - b^2, \qquad y^2 = 2ab, \qquad z = a^2 + b^2,$$

where $(a, b) = 1$ and exactly one of $a$ and $b$ is odd. If $a$ were even,
we would have

$$1 \equiv x^2 = a^2 - b^2 \equiv -1 \pmod 4,$$

so $2 \mid b$. We apply Theorem 3-13 again, this time to the equation
$x^2 + b^2 = a^2$, and obtain

$$x = p^2 - q^2, \qquad b = 2pq, \qquad a = p^2 + q^2,$$

where $(p, q) = 1$, $p > q > 0$, and not both of $p$ and $q$ are odd. From

$$y^2 = 2ab$$

we have

$$y^2 = 4pq(p^2 + q^2).$$

Here $p$, $q$ and $p^2 + q^2$ are relatively prime in pairs, so each must be
a square:

$$p = r^2, \qquad q = s^2, \qquad p^2 + q^2 = t^2,$$

from which

$$r^4 + s^4 = t^2.$$

Now

$$x = r^4 - s^4, \qquad y = 2rst, \qquad z = a^2 + b^2 = r^8 + 6r^4s^4 + s^8,$$

so that

$$z > (r^4 + s^4)^2 = t^4,$$

or $t < z^{1/4}$. It follows that if one solution of $x^4 + y^4 = z^2$ were
known, another solution $r$, $s$, $t$ could be found for which $rst \ne 0$ and
$0 < t < z^{1/4}$. But this would give an infinite decreasing sequence of
positive integers.
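A brute-force search is of course no substitute for the descent argument, but it gives a quick empirical check of Theorem 3-14 on a small range. In the sketch below (function name and bound are hypothetical), every pair $1 \le x \le y \le$ `bound` is tested to see whether $x^4 + y^4$ is a perfect square, using exact integer arithmetic.

```python
from math import isqrt


def fourth_power_sums_that_are_squares(bound):
    """Return all (x, y, z) with 1 <= x <= y <= bound and x^4 + y^4 == z^2."""
    hits = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            n = x ** 4 + y ** 4
            z = isqrt(n)          # exact integer square root
            if z * z == n:
                hits.append((x, y, z))
    return hits


if __name__ == "__main__":
    # By Theorem 3-14 the list is expected to be empty for every bound.
    print(fourth_power_sums_that_are_squares(150))
```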

The case $n = 3$ is rather more difficult, since it is necessary to work
in the quadratic field $K_3 = R(\zeta)$, where $\zeta = (-1 + i\sqrt{3})/2$ is a
primitive cube root of unity. Not all the complications of the general
case are present, however, since there is unique factorization of the
integers of $K_3$, as the following theorem shows.

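Theorem 3-15. Given any two integers $\alpha$ and $\gamma$ of $K_3$, of which
$\gamma \ne 0$, there are integers $\kappa$ and $\rho$ such that

$$\alpha = \kappa\gamma + \rho, \qquad 0 \le N\rho < N\gamma.$$

The integers of $K_3$ therefore form a Euclidean domain.

Proof: Since 1 and $\zeta$ form an integral basis for $K_3$, we can write

$$\frac{\alpha}{\gamma} = \frac{a + b\zeta}{c + d\zeta} = \frac{(a + b\zeta)(c + d\zeta^2)}{c^2 - cd + d^2} = R + S\zeta,$$

where $a$, $b$, $c$, and $d$ are rational integers, and $R$ and $S$ are rational.
Choose $x$ and $y$ in $Z$ such that

$$|R - x| \le \tfrac12, \qquad |S - y| \le \tfrac12;$$

then

$$N\left( \frac{\alpha}{\gamma} - (x + y\zeta) \right) = (R - x)^2 - (R - x)(S - y) + (S - y)^2 \le \tfrac34.$$

Hence, if $\kappa = x + y\zeta$ and $\rho = \alpha - \kappa\gamma$, then

$$N\rho \le \tfrac34 N\gamma < N\gamma, \qquad \text{and} \qquad N\rho = \rho\bar\rho = |\rho|^2 \ge 0.$$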
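The proof of Theorem 3-15 is constructive, and the division step can be sketched in a few lines. In the illustration below, an integer $a + b\zeta$ of $K_3$ is represented by the pair `(a, b)`; this representation and the function names are assumptions made for the example, not notation from the text. Exact rational arithmetic is used so that the rounding of $R$ and $S$ to nearest integers is reliable.

```python
from fractions import Fraction
from math import floor


def mul(u, v):
    """Multiply a + b*zeta by c + d*zeta, using zeta^2 = -1 - zeta."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c - b * d)


def norm(u):
    """Norm of a + b*zeta, namely a^2 - ab + b^2."""
    a, b = u
    return a * a - a * b + b * b


def divmod_k3(alpha, gamma):
    """Return (kappa, rho) with alpha = kappa*gamma + rho and N(rho) < N(gamma)."""
    n = norm(gamma)
    if n == 0:
        raise ZeroDivisionError("gamma must be nonzero")
    c, d = gamma
    # The conjugate of c + d*zeta is c + d*zeta^2 = (c - d) - d*zeta.
    numerator = mul(alpha, (c - d, -d))
    R = Fraction(numerator[0], n)
    S = Fraction(numerator[1], n)
    # Nearest integers, so that |R - x| <= 1/2 and |S - y| <= 1/2.
    x = floor(R + Fraction(1, 2))
    y = floor(S + Fraction(1, 2))
    kappa = (x, y)
    product = mul(kappa, gamma)
    rho = (alpha[0] - product[0], alpha[1] - product[1])
    return kappa, rho


if __name__ == "__main__":
    kappa, rho = divmod_k3((7, 3), (2, 1))
    print(kappa, rho, norm(rho), norm((2, 1)))
```

Iterating this step gives a Euclidean algorithm for greatest common divisors in $K_3$, which is the route to unique factorization mentioned in the text.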


Theorem 3-16. The equation

$$\xi^3 + \eta^3 + \vartheta^3 = 0 \tag{2}$$

has no solution in nonzero integers of $K_3$. It therefore has no
solution in nonzero rational integers.

Proof: We first note that one of $\xi$, $\eta$, and $\vartheta$ must be divisible by the
prime $\pi = 1 - \zeta$, if (2) holds. For put

$$\xi + \eta = \rho, \qquad \eta + \vartheta = \sigma, \qquad \vartheta + \xi = \tau.$$

Then a simple calculation, using (2), shows that

$$(\rho + \sigma + \tau)^3 = 24\rho\sigma\tau.$$
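The "simple calculation" can be written out as follows. This is a sketch under the reading $\rho = \xi + \eta$, $\sigma = \eta + \vartheta$, $\tau = \vartheta + \xi$ of the substitution (an assumption about the intended definitions); it rests on the standard identity $(u + v + w)^3 = u^3 + v^3 + w^3 + 3(u + v)(v + w)(w + u)$.

```latex
\begin{align*}
(\xi + \eta + \vartheta)^3
  &= \xi^3 + \eta^3 + \vartheta^3
     + 3(\xi + \eta)(\eta + \vartheta)(\vartheta + \xi) \\
  &= 3\rho\sigma\tau
     && \text{by (2)}, \\
\rho + \sigma + \tau
  &= 2(\xi + \eta + \vartheta), \\
(\rho + \sigma + \tau)^3
  &= 8(\xi + \eta + \vartheta)^3 = 24\rho\sigma\tau .
\end{align*}
```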

Since the expression on the right side of this equation is divisible by

$$3 = -\zeta^2\pi^2,$$

the left side must be divisible by $\pi$, and therefore by $\pi^3$. Returning
to the right side, it follows that one of $\rho$, $\sigma$, or $\tau$ must be divisible
by $\pi$. If $\pi \mid \rho$, then $\pi \mid (\xi^3 + \eta^3)$, so $\pi \mid \vartheta^3$, and finally $\pi \mid \vartheta$.

If there were a common factor in two of $\xi$, $\eta$, and $\vartheta$, it would also
occur in the third, and could be divided out; so suppose that (2)
holds, that $\xi$, $\eta$, and $\vartheta$ are relatively prime in pairs, and that $\pi \mid \vartheta$.

By Theorem 3-10, we may suppose that an appropriate power of
$\zeta$ has been introduced into $\xi$ and $\eta$ so that

$$\xi \equiv 1, \qquad \eta \equiv -1 \pmod 3,$$

which we express by putting

$$\xi = 1 + 3\alpha, \qquad \eta = -1 + 3\beta,$$

where $\alpha$ and $\beta$ are integers of $K_3$. Put

$$A = \frac{\xi + \zeta\eta}{\pi}, \qquad B = \frac{\zeta\xi + \eta}{\pi}, \qquad C = \frac{\zeta^2(\xi + \eta)}{\pi};$$

these numbers are integers of $K_3$, since

$$A = 1 + \frac{3}{\pi}(\alpha + \zeta\beta),$$

$$B = -1 + \frac{3}{\pi}(\zeta\alpha + \beta),$$

$$C = \frac{3}{\pi}(\alpha + \beta)\zeta^2.$$

Moreover,

$$A + B + C = 0, \tag{3}$$

$$ABC = \left( \frac{-\vartheta}{\pi} \right)^3, \tag{4}$$

$$\xi = -\zeta A + \zeta^2 B, \qquad \eta = \zeta^2 A - \zeta B. \tag{5}$$

From (5) we see that $(A, B) = 1$, since otherwise $\xi$ and $\eta$ would
have a common factor. From (3), also $(A, C) = (B, C) = 1$.
It follows from (4) that $A$, $B$, and $C$ must all be cubes, say
$A = \varphi^3$, $B = \chi^3$, $C = \psi^3$, and

$$\varphi^3 + \chi^3 + \psi^3 = 0.$$

Now

$$A \equiv 1, \qquad B \equiv -1, \qquad C \equiv 0 \pmod \pi,$$

so that from (4), $\psi$ contains a smaller power of $\pi$ than does $\vartheta$.

Repeating the argument a sufficient number of times, we would
arrive eventually at a solution of (2) in which no variable is divisible
by $\pi$, which is impossible.

3-5 Kummer's theorem. If $p$ is an odd rational prime, and its
associated cyclotomic field $K_p$ has class number $h$, then $p$ is said to
be regular if $p \nmid h$. According to Theorem 3-5, if $p$ is a regular prime
and $A$ and $B$ are ideals in $K_p$ such that $A^p \sim B^p$, then $A \sim B$. It
was this essential property of the regular primes which enabled
Kummer to prove that Fermat's conjecture is correct for all regular
primes. (Unfortunately, there are infinitely many irregular ones.)
We shall not be able to prove Kummer's theorem in its entirety, but
shall have to assume without proof a difficult preliminary result.
We can, however, prove the following theorem.

Theorem 3-17. If $p$ is regular, the equation

$$x^p + y^p + z^p = 0 \tag{6}$$

has no solution in rational integers $x$, $y$, $z$ for which $p \nmid xyz$.

Proof: Suppose that the theorem is false, and that $x$, $y$, and $z$
satisfy all the requirements. We can assume that $(x, y) = 1$ and
$p > 3$; as usual, $\zeta$ is a primitive $p$th root of unity, and $P = [1 - \zeta]$.
From (6) we obtain

$$\prod_{m=0}^{p-1} (x + \zeta^m y) = -z^p,$$

so that

$$\prod_{m=0}^{p-1} [x + \zeta^m y] = [z]^p. \tag{7}$$

Now no two of the factors on the left have a common factor. For,
if $Q$ is a prime ideal such that $Q \mid [x + \zeta^{m_1} y]$ and $Q \mid [x + \zeta^{m_2} y]$ for
$m_1 < m_2$, then

$$Q \mid [\zeta^{m_1}(1 - \zeta^{m_2 - m_1}) y],$$

and hence $Q \mid P[y]$. But from (7), $Q \mid [z]$, so $Q \ne P$ (since $p \nmid z$); hence
$Q \mid [y]$. But then also $Q \mid [x]$, and we deduce that $NQ \mid N[x]$ and
$NQ \mid N[y]$, which is contrary to the assumption that $(x, y) = 1$.

It follows that each factor on the left side of (7) is the $p$th power
of an ideal. If

$$[x + \zeta y] = A^p,$$

then $A^p \sim [1] = [1]^p$, so that by Theorem 3-5, $A$ itself is principal,
say $A = [\alpha]$. Then

$$[x + \zeta y] = [\alpha]^p = [\alpha^p].$$

Hence

$$x + \zeta y = \epsilon\alpha^p,$$

where $\epsilon$ is a unit of $K_p$. Using the canonical form for units in $K_p$
obtained in Theorem 3-12, we have

$$x + \zeta y = \zeta^g \eta \alpha^p, \qquad 0 \le g \le p - 1,$$

where $\eta$ is real. By Theorem 3-9, since $[p] \mid P^p$,

$$\alpha^p \equiv a \pmod{[p]}$$

for some $a$ in $Z$, so that

$$x + \zeta y \equiv \zeta^g \sigma \pmod{[p]},$$

where $\sigma$ is a real integer of $K_p$. The complex conjugate

$$\frac{\zeta^{g}(x + \zeta^{-1} y) - \sigma}{p}$$

of the integer

$$\frac{\zeta^{-g}(x + \zeta y) - \sigma}{p}$$

is also a field conjugate, and is therefore also an integer. Since
$\bar p = p$ and $\bar\sigma = \sigma$, we have

$$\sigma \equiv (x + \zeta y)\zeta^{-g} \pmod{[p]}$$

and

$$\sigma \equiv (x + \zeta^{-1} y)\zeta^{g} \pmod{[p]},$$

so that

$$x\zeta^{-g} + y\zeta^{1-g} - x\zeta^{g} - y\zeta^{g-1} \equiv 0 \pmod{[p]}. \tag{8}$$

Two of these exponents must be congruent modulo $p$. For suppose
that they are all distinct, and put

$$\beta = \frac{x}{p}\zeta^{-g} + \frac{y}{p}\zeta^{1-g} - \frac{x}{p}\zeta^{g} - \frac{y}{p}\zeta^{g-1}.$$

Then $p\beta$ has a representation in terms of distinct elements of an
integral basis, the coefficients not being divisible by $p$. But since
$\beta$ is an integer, $p\beta$ also has a representation in which the coefficients
are divisible by $p$, and this is contrary to the definition of a basis.
We conclude that $g$ must have one of the values 0, 1, or $(p + 1)/2$
(that is, $2g \equiv 1 \pmod p$).

If $g = 0$, the congruence (8) gives

$$y(\zeta^2 - 1) \equiv 0 \pmod{[p]},$$

whence, since $\zeta^2 - 1$ is an associate of $\pi$,

$$P^{p-1} \mid [y]P, \qquad P \mid [y], \qquad p \mid y,$$

which is false. If $g = 1$, then (8) yields

$$x(1 - \zeta^2) \equiv 0 \pmod{[p]},$$

which implies that $p \mid x$, which is also false. Finally, if $g = (p + 1)/2$,
then from (8) we get

$$(x - y)\pi \equiv 0 \pmod{[p]},$$

which gives

$$x \equiv y \pmod p.$$

Interchanging $y$ and $z$ in equation (7), we deduce that also

$$x \equiv z \pmod p.$$

But then equation (6) implies that

$$x^p + y^p + z^p \equiv 3x^p \equiv 0 \pmod p,$$

which is false since $p > 3$ and $p \nmid x$. Hence the theorem is not false.

Because of its methodological interest, we deduce the general
Kummer theorem from the following lemma, whose proof is too long
for inclusion here:

Kummer's lemma. Let $p$ be a regular prime. Then if $\epsilon$ is a unit
of $K_p$ and $a$ is a rational integer such that

$$\epsilon \equiv a \pmod{P^p},$$

then $\epsilon$ is the $p$th power of another unit of $K_p$.

This is a partial converse of Theorem 3-9. Using it, we can
generalize Theorem 3-17 in two ways: by allowing $x$, $y$, and $z$ to be
integers of $K_p$ instead of rational integers, or by dropping the
restriction that $p \nmid xyz$.

Theorem 3-18. If $p$ is a regular prime, the equation

$$x^p + y^p + z^p = 0$$

has no solution in nonzero integers $x$, $y$, $z$ of $K_p$ for which $\pi \mid xyz$. It
therefore has no solution in nonzero rational integers $x$, $y$, $z$ for which
$p \mid xyz$, and therefore (by Theorem 3-17) no nonzero rational integral
solutions.

Proof: We first show that the equation

$$x^p + y^p = \epsilon\pi^{pu} z'^p, \qquad \pi \nmid xyz', \quad \epsilon \text{ a unit of } K_p, \tag{9}$$

has no nontrivial solution if $u = 1$. Equation (9) is a generalization
of the equation obtained from (6) by supposing that $z = z'\pi^u$, where
$\pi \nmid z'$.

We may suppose that $x$ and $y$ have no common numerical factor,
since it would also occur in $z'$ and could be canceled out. (Notice that
it cannot be assumed that the ideals $[x]$ and $[y]$ are relatively prime,
since $[x, y]$ may not be principal.) We may also suppose that $x$ and $y$
are primary, since they may be multiplied by appropriate powers of $\zeta$
without affecting (9). If (9) is written in the form

$$\prod_{m=0}^{p-1} (x + \zeta^m y) = \epsilon\pi^{pu} z'^p, \tag{9'}$$

it is clear that at least one of the factors on the left, say $x + \zeta^k y$, is
divisible by $\pi$. Since, however, the differences

$$(x + \zeta^k y) - (x + \zeta^m y) = (\zeta^k - \zeta^m) y$$

and

$$\zeta^m (x + \zeta^k y) - \zeta^k (x + \zeta^m y) = -(\zeta^k - \zeta^m) x$$

are also divisible by $\pi$, each factor on the left in (9') must be divisible
by $\pi$. If two factors were divisible by $\pi^2$, we would have

$$\pi^2 \mid (\zeta^k - \zeta^m) y, \qquad \pi^2 \mid \epsilon' \pi y \quad (\epsilon' \text{ a unit}), \qquad \pi \mid y,$$

and similarly $\pi \mid x$, contrary to assumption. On the other hand, since
$x$ and $y$ are primary, $x + y$ is congruent modulo $P^2$ to a rational
integer; this integer is divisible by $\pi$, hence by $p$, and so

$$x + y \equiv 0 \pmod{P^2}.$$

Thus the total number of factors of $\pi$ on the left side of (9') is at least
$p + 1$, so that $u > 1$.

Now rewrite (9') as

$$\prod_{m=0}^{p-1} [x + \zeta^m y] = P^{pu} [z']^p. \tag{9''}$$

Any common factor different from $P$ of two ideals in the product on
the left side of (9'') must be a factor of both $[x]$ and $[y]$, and therefore
of $[z']$. After dividing out every such common factor, as well as one
factor of $P$ from each ideal, the ideals remaining on the left are
pairwise prime, and their product is a $p$th power; therefore each factor
separately is a $p$th power.

Combining all these results, we can write

$$[x + y] = P^{p(u-1)+1} J_0^p D,$$

$$[x + \zeta^m y] = P J_m^p D, \qquad 1 \le m \le p - 1,$$

where $D = [x, y]$, and $J_0, J_1, \ldots, J_{p-1}$ are certain ideals not divisible
by $P$. If we put $t_m = p(u - 1) + 1$ or 1, according as $m = 0$ or
$m > 0$, we have, for $m \ne 1$,

$$[x + \zeta^m y] P J_1^p D = [x + \zeta y] P^{t_m} J_m^p D;$$

since $P$ is a principal ideal, it follows that

$$J_m^p D \sim J_1^p D,$$

so that $J_m^p \sim J_1^p$, and by Theorem 3-5, $J_m \sim J_1$. Thus integers
$\gamma_m$ and $\delta_m$ (which are not divisible by $\pi$) exist such that

$$[\gamma_m] J_m = [\delta_m] J_1. \tag{10}$$

Raising both sides of (10) (with $m = 0$) to the $p$th power, and then
multiplying through by $D P^{p(u-1)+1}$, we have

$$[\gamma_0]^p D P^{p(u-1)+1} J_0^p = [\delta_0]^p D P^{p(u-1)+1} J_1^p,$$

$$[\gamma_0^p][x + y] = [x + \zeta y][\pi^{p(u-1)}][\delta_0^p],$$

and similarly, with $m = 2$,

$$D P [\gamma_2]^p J_2^p = D P [\delta_2]^p J_1^p,$$

$$[x + \zeta^2 y][\gamma_2^p] = [x + \zeta y][\delta_2^p],$$

so that

$$\gamma_0^p (x + y) = \epsilon_1 (x + \zeta y) \pi^{p(u-1)} \delta_0^p, \qquad \gamma_2^p (x + \zeta^2 y) = \epsilon_2 (x + \zeta y) \delta_2^p, \tag{11}$$

where $\epsilon_1$ and $\epsilon_2$ are units.
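We now use the identity

$$(x + \zeta^2 y) + \zeta(x + y) = (x + \zeta y)(1 + \zeta).$$

We multiply through by $\gamma_0^p \gamma_2^p$, and in the resulting equation replace
the left sides of equations (11) by the right sides. After canceling the
common factor $x + \zeta y$, there results

$$\epsilon_2 (\gamma_0\delta_2)^p + \epsilon_1 \zeta \pi^{p(u-1)} (\gamma_2\delta_0)^p = (1 + \zeta)(\gamma_0\gamma_2)^p.$$

Since $\epsilon_1$, $\epsilon_2$, $\zeta$, and $1 + \zeta = (1 - \zeta^2)/(1 - \zeta)$ are units, this equation
is of the form

$$\xi^p + \epsilon_3 \eta^p = \epsilon_4 \pi^{p(u-1)} \vartheta^p, \tag{12}$$

where $\epsilon_3$ and $\epsilon_4$ are units and $\pi \nmid \xi\eta\vartheta$. By Theorem 3-9,

$$\xi^p \equiv a_1, \qquad \eta^p \equiv a_2 \pmod{P^p},$$

where $a_1$ and $a_2$ are rational integers; since $u > 1$, (12) gives

$$a_1 + \epsilon_3 a_2 \equiv 0 \pmod{P^p}.$$

Since $\pi \nmid \eta$, also $P \nmid [a_2]$, so that $p \nmid a_2$. Choose $a_3$ so that $a_2 a_3 \equiv 1 \pmod{p^2}$;
then

$$a_2 a_3 \equiv 1 \pmod{P^p},$$

$$a_1 a_3 + \epsilon_3 \equiv 0 \pmod{P^p}.$$

By Kummer's lemma, $\epsilon_3 = \epsilon_5^p$, and (12) becomes

$$\xi^p + (\epsilon_5\eta)^p = \epsilon_4 \pi^{p(u-1)} \vartheta^p,$$

which is an equation of the form (9) with $u$ replaced by $u - 1$.
Repeating the argument $u - 2$ times, we would have a solution of (9)
with $u = 1$, which is impossible.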



Before leaving the subject of Fermat’s conjecture, it might be of
some interest to mention certain other facts known about it. We
consider only the solvability of equation (6), 

+ if ^ 0 , 

in Z. 

It was proved by Wieferich in 1909 that if (6) holds in integers
x, y, and z such that $p \nmid xyz$ (the so-called Case I), then

$$2^{p-1} \equiv 1 \pmod{p^2}.$$

Later investigators have shown that in Case I,

$$q^{p-1} \equiv 1 \pmod{p^2}$$

for every prime $q \le 43$; J. B. Rosser used this fact to show that
there are no solutions in Case I for p < 41,000,000. D. H. and Emma
Lehmer later extended Rosser’s method to prove Fermat’s conjec- 
ture in Case I for p < 253,747,889. This in turn implies that if 
there is a solution in Case I, it must be that log log e > 23. 
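
Wieferich's congruence is easy to test numerically. The following is a minimal sketch, not part of the original text; the function names `primes_below` and `wieferich_primes` are illustrative choices. It lists the primes below a bound that satisfy $2^{p-1} \equiv 1 \pmod{p^2}$, which are the only primes for which Case I is not excluded by Wieferich's criterion alone.

```python
def primes_below(limit):
    """Return all primes p < limit (limit >= 2) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i in range(limit) if sieve[i]]


def wieferich_primes(limit, base=2):
    """Primes p < limit with base**(p-1) congruent to 1 modulo p**2."""
    return [p for p in primes_below(limit)
            if p != base and pow(base, p - 1, p * p) == 1]


print(wieferich_primes(4000))  # [1093, 3511]
```

The three-argument `pow` keeps every intermediate value below $p^2$, so the search stays fast even for much larger bounds.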

Without the restriction to Case I, Theorem 3-18 disposes of the
regular primes. Kummer also found criteria to handle the irregular
primes less than 164; this was pushed on to all p < 619 by H. S.
Vandiver and his collaborators, and quite recently D. H. and E. 
Lehmer and Vandiver have used high-speed computing techniques 
to settle the problem for all p < 2000. It turns out that of the 302 
primes less than 2000, 118 are irregular; while it is not known that 
there are infinitely many regular primes, there is nothing in the 
limited data available to indicate that there are only finitely many. 

3-6 The equation $x^2 + 2 = y^3$. For the remainder of this chapter
we shall be primarily concerned with the cubic analog of Pell’s 
equation. At one point in the argument, however, we shall need the 
following auxiliary result. 

Theorem 3-19. The only solutions in Z of the equation 

$$x^2 + 2 = y^3 \tag{13}$$

are x = ±5, y = 3. 

Proof: Following Euler's idea, we make use of the arithmetic of
the quadratic field $R(\sqrt{-2})$. By Theorem 2-16, the integers of this
field are of the form $a + b\sqrt{-2}$, where a and b are rational integers.
By a proof exactly paralleling that of Theorem 3-15, it can be shown
that they form a Euclidean domain: given a, b, c, d in Z, with $cd \ne 0$,
there are e, f, g, h in Z such that

$$a + b\sqrt{-2} = (c + d\sqrt{-2})(e + f\sqrt{-2}) + (g + h\sqrt{-2}),$$

$$g^2 + 2h^2 < c^2 + 2d^2.$$

It follows that $R[\sqrt{-2}]$ is a unique factorization domain.

We first show that if x and y satisfy (13), then $x + \sqrt{-2}$ and
$x - \sqrt{-2}$ are relatively prime. It is clear that

$$(x + \sqrt{-2},\; x - \sqrt{-2}) \mid 2\sqrt{-2},$$

and since $-2\sqrt{-2} = (\sqrt{-2})^3$ and $\sqrt{-2}$ is prime in the domain (by
Theorem 2-39), it must be that $(x + \sqrt{-2},\, x - \sqrt{-2}) = (\sqrt{-2})^m$,
$0 \le m \le 3$. But if $x + \sqrt{-2} = (a + b\sqrt{-2})\sqrt{-2}$, then $x = -2b$,
whence, by (13),

$$4b^2 + 2 = y^3,$$

$$y^3 \equiv 2 \pmod 4,$$

which is impossible.

Since the only units of $R[\sqrt{-2}]$ are ±1, it follows from (13) that

$$x + \sqrt{-2} = (a + b\sqrt{-2})^3,$$

where a and b are rational integers, and equating real and imaginary
parts gives

$$a^3 - 6ab^2 = x,$$

$$3a^2b - 2b^3 = 1.$$

From the second of these equations it follows that $b = \pm 1$, and hence
that $3a^2 - 2 = \pm 1$, or $a = \pm 1$. From the first, $x = \pm 1 \mp 6 = \mp 5$.
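
As an informal check of Theorem 3-19 (an illustration only, not a substitute for the proof), one can search a finite range for integer points on $x^2 + 2 = y^3$. The function name `solutions` and the search bound are assumptions made for this sketch.

```python
from math import isqrt


def solutions(y_max):
    """All integer pairs (x, y) with x*x + 2 == y**3 and 1 <= y <= y_max."""
    found = []
    for y in range(1, y_max + 1):
        n = y ** 3 - 2            # candidate value of x^2
        if n < 0:
            continue
        x = isqrt(n)
        if x * x == n:
            found.extend([(x, y)] if x == 0 else [(-x, y), (x, y)])
    return found


print(solutions(10_000))  # [(-5, 3), (5, 3)]
```

Values $y \le 0$ need not be searched, since then $y^3 \le 0 < x^2 + 2$.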

3-7 Pure cubic fields. The field $L = R(\sqrt[3]{d})$, in which d > 1 is a
cube-free rational integer and $\sqrt[3]{d}$ is real, is called a pure cubic field.
In this section we determine an integral basis for L and note certain
other properties.

Since d is cube-free, we can write

$$d = ab^2,$$

where ab is square-free. Since $\sqrt[3]{d^2} = b\sqrt[3]{a^2b}$, the numbers 1,
$\sqrt[3]{ab^2}$, $\sqrt[3]{a^2b}$ form a basis for L. Following Dedekind, we say that L
is of the first or second kind, according as 9 does not, or does, divide
$a^2 - b^2$. The reason for the distinction is made clear in the following
theorem.


Theorem 3-20. The numbers

$$1,\quad \sqrt[3]{ab^2},\quad \sqrt[3]{a^2b}$$

form an integral basis for L if it is of the first kind. The numbers

$$\tfrac13\bigl(1 + a\sqrt[3]{ab^2} + b\sqrt[3]{a^2b}\bigr),\quad \sqrt[3]{ab^2},\quad \sqrt[3]{a^2b}$$

form an integral basis for L if it is of the second kind.

Remark: Note that the second basis represents every integer
represented by the first, since

$$3z_1\cdot\frac{1 + a\sqrt[3]{ab^2} + b\sqrt[3]{a^2b}}{3} + (z_2 - az_1)\sqrt[3]{ab^2} + (z_3 - bz_1)\sqrt[3]{a^2b}
= z_1 + z_2\sqrt[3]{ab^2} + z_3\sqrt[3]{a^2b}.$$

Proof: Suppose that ω is an integer in L, and that

$$\omega = x_1 + x_2\sqrt[3]{ab^2} + x_3\sqrt[3]{a^2b}, \qquad x_1, x_2, x_3 \text{ in } R.$$

Then the conjugates of ω are

$$\omega' = x_1 + \rho x_2\sqrt[3]{ab^2} + \rho^2 x_3\sqrt[3]{a^2b},$$

$$\omega'' = x_1 + \rho^2 x_2\sqrt[3]{ab^2} + \rho x_3\sqrt[3]{a^2b},$$

where ρ is a primitive cube root of unity. We see that

$$\omega + \omega' + \omega'' = 3x_1,$$

$$\sqrt[3]{a^2b}\,(\omega + \rho^2\omega' + \rho\omega'') = 3abx_2,$$

$$\sqrt[3]{ab^2}\,(\omega + \rho\omega' + \rho^2\omega'') = 3abx_3,$$

and since the left sides of these equations are algebraic integers and
the right sides are rational, it follows that the numbers $3x_1$, $3abx_2$,
$3abx_3$ are rational integers. Hence for any integer ω in L, there are
$y_1, y_2, y_3$ in Z such that

$$3ab\,\omega = y_1 + y_2\sqrt[3]{ab^2} + y_3\sqrt[3]{a^2b}. \tag{14}$$

We show first that ab is a divisor of $y_1$, $y_2$, and $y_3$, and so can be
omitted in (14).

Let p be a rational prime dividing a, and let P be a prime ideal of
L which divides [p]. It was supposed that ab is square-free; a fortiori,
(a, b) = 1, and $P \nmid [b]$. If we put

$$\alpha = \sqrt[3]{ab^2}, \qquad \beta = \sqrt[3]{a^2b},$$

then $P \mid [\alpha]^3$, so $P \mid [\alpha]$; since L is of degree 3, it follows from Theorem
2-39 that $[p] = P^3$. Hence $P \,\|\, [\alpha]$ and $P^2 \,\|\, [\beta]$.
Now suppose, in accordance with (14), that

$$y_1 + y_2\alpha + y_3\beta \equiv 0 \pmod{3ab}.$$

Then

$$y_1 + y_2\alpha + y_3\beta \equiv 0 \pmod{P^3},$$

$$y_1 \equiv 0 \pmod{P},$$

$$y_1 \equiv 0 \pmod{p}, \tag{15}$$

$$y_1 \equiv 0 \pmod{P^3},$$

$$y_2\alpha + y_3\beta \equiv 0 \pmod{P^3},$$

$$y_2\alpha \equiv 0 \pmod{P^2},$$

$$y_2 \equiv 0 \pmod{p}, \tag{16}$$

$$y_3\beta \equiv 0 \pmod{P^3},$$

$$y_3 \equiv 0 \pmod{p}. \tag{17}$$

By equations (15), (16), and (17), and the fact that p was an arbi-
trary prime divisor of a, we see that a divides $y_1$, $y_2$, and $y_3$. Simi-
larly, b divides $y_1$, $y_2$, and $y_3$. It follows that there are $z_1, z_2, z_3$ in
Z such that

$$3\omega = z_1 + z_2\alpha + z_3\beta. \tag{18}$$
Let the defining equation of ω be

$$x^3 + c_1x^2 + c_2x + c_3 = 0, \qquad c_1, c_2, c_3 \text{ in } Z.$$

Then by (18) and the analogous equations for 3ω' and 3ω'',

$$c_1 = -(\omega + \omega' + \omega'') = -z_1,$$

$$c_2 = \omega\omega' + \omega\omega'' + \omega'\omega'' = \tfrac13(z_1^2 - abz_2z_3), \tag{19}$$

$$c_3 = -\omega\omega'\omega'' = -\tfrac1{27}(z_1^3 + ab^2z_2^3 + a^2bz_3^3 - 3abz_1z_2z_3). \tag{20}$$

Suppose that $3 \mid a$; then $3 \nmid b$, and L is of the first kind. Since $c_2$ is
in Z, $3 \mid z_1$. Since $c_3$ is in Z,

$$0 \equiv -27c_3 \equiv ab^2z_2^3 \pmod 9,$$

whence $3 \mid z_2$, and by (20) again, $3 \mid z_3$. In this case, then, the numbers
1, $\sqrt[3]{ab^2}$, $\sqrt[3]{a^2b}$ constitute an integral basis for L. A similar argu-
ment applies in the case that $3 \mid b$.

Suppose now that $3 \nmid ab$, so that

$$a^2 \equiv b^2 \equiv 1 \pmod 3. \tag{21}$$

If $3 \mid z_1$, then by (19), $3 \mid z_2z_3$; if $3 \mid z_2$, say, then it follows from (20)
that also $3 \mid z_3$. Similarly, if $3 \mid z_2$, then also $3 \mid z_1$ and $3 \mid z_3$. Hence 3
divides all or none of $z_1, z_2, z_3$; in the first case ω is of the form speci-
fied in the theorem.

We now examine the possibility that ω in (18) is an integer, but
that $3 \nmid z_1z_2z_3$. Then by (20), (21), and Fermat's theorem,

$$z_1^3 + ab^2z_2^3 + a^2bz_3^3 \equiv 0 \pmod 3,$$

$$z_1 + az_2 + bz_3 \equiv 0 \pmod 3,$$

and then by (19),

$$z_1 \equiv az_2 \equiv bz_3 \pmod 3,$$

$$z_2 \equiv az_1, \qquad z_3 \equiv bz_1 \pmod 3,$$

$$z_2 = az_1 + 3t_2, \qquad z_3 = bz_1 + 3t_3.$$

Substituting these expressions for $z_2$ and $z_3$ into (20), we obtain

$$\begin{aligned}
-27c_3 &= z_1^3 + ab^2(az_1 + 3t_2)^3 + a^2b(bz_1 + 3t_3)^3 - 3abz_1(az_1 + 3t_2)(bz_1 + 3t_3) \\
&= z_1^3(1 + a^4b^2 + a^2b^4 - 3a^2b^2) + 9z_1^2(a^3b^2t_2 + a^2b^3t_3 - a^2bt_3 - ab^2t_2) \\
&\qquad + 27z_1(a^2b^2t_2^2 + a^2b^2t_3^2 - abt_2t_3) + 27(ab^2t_2^3 + a^2bt_3^3) \\
&\equiv z_1^3(1 + a^4b^2 + a^2b^4 - 3a^2b^2) + 9z_1^2\bigl(ab^2t_2(a^2 - 1) + a^2bt_3(b^2 - 1)\bigr) \pmod{27}.
\end{aligned}$$

By (21),

$$0 \equiv -27c_3 \equiv z_1^3(1 + a^4b^2 + a^2b^4 - 3a^2b^2) \pmod{27},$$

and it follows that

$$1 + a^4b^2 + a^2b^4 - 3a^2b^2 \equiv 0 \pmod{27}. \tag{22}$$

Using (21), we can put

$$b^2 = a^2 + 3f + 9g,$$

where f and g are rational integers and $0 \le f \le 2$. Then the con-
gruence (22) reduces to

$$\varphi(f, a) = 2a^6 + (9f - 3)a^4 + 9(f^2 - f)a^2 + 1 \equiv 0 \pmod{27}.$$

For f = 0, this becomes

$$(a^2 - 1)^2(2a^2 + 1) \equiv 0 \pmod{27},$$

which is true for every a not divisible by 3, since

$$2a^2 + 1 \equiv a^2 - 1 \equiv 0 \pmod 3.$$

Moreover, for every a such that $3 \nmid a$,

$$\varphi(1, a) \equiv \varphi(0, a) + 9a^4 \equiv 9a^4 \not\equiv 0 \pmod{27},$$

$$\varphi(2, a) \equiv \varphi(0, a) + 18a^4 + 18a^2 \equiv 18a^2(a^2 + 1) \not\equiv 0 \pmod{27}.$$

Thus we find that if $3 \nmid z_1z_2z_3$, then $c_3$ is in Z if and only if

$$a^2 - b^2 \equiv 0 \pmod 9$$

(i.e., if and only if L is of the second kind) and $az_2 \equiv bz_3 \pmod 3$.
If this is the case, then $c_1$ and $c_2$ are also rational integers, and

$$\omega = \tfrac13\bigl(z_1 + (az_1 + 3t_2)\alpha + (bz_1 + 3t_3)\beta\bigr)$$

is an integer in L. The proof is complete.
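
The decisive step of the proof is that, for $3 \nmid ab$, congruence (22) holds exactly when $9 \mid a^2 - b^2$. The sketch below checks this equivalence over a small range; it is an illustration only, and the helper names `c3_numerator` and `second_kind` are hypothetical. The quantity tested is $-27c_3$ for $\omega = \tfrac13(1 + a\alpha + b\beta)$, that is, formula (20) with $(z_1, z_2, z_3) = (1, a, b)$.

```python
def c3_numerator(a, b):
    """Value of -27*c3 from (20) for z = (1, a, b): the left side of (22)."""
    return 1 + a**4 * b**2 + a**2 * b**4 - 3 * a**2 * b**2


def second_kind(a, b):
    """True when 9 divides a^2 - b^2 (Dedekind's second kind)."""
    return (a * a - b * b) % 9 == 0


# Pairs with 3 not dividing ab where (22) and the 'second kind' test disagree.
mismatches = [(a, b) for a in range(1, 60) for b in range(1, 60)
              if a % 3 and b % 3
              and (c3_numerator(a, b) % 27 == 0) != second_kind(a, b)]
print(mismatches)  # []
```

For example, d = 10 (a = 10, b = 1) gives 9801 = 27 · 363, so $\tfrac13(1 + 10\sqrt[3]{10} + \sqrt[3]{100})$ has integral norm, in line with the second part of Theorem 3-20.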

In the course of the proof, it appeared that if $\omega = x + y\alpha + z\beta$, then
$N\omega = x^3 + ab^2y^3 + a^2bz^3 - 3abxyz$.
We now consider the units of L. If L is of the first kind, then

$$\eta = x + y\alpha + z\beta$$

is a unit if and only if $N\eta = \eta\eta'\eta'' = \pm 1$, or

$$x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = \pm 1. \tag{23}$$

If L is of the second kind, then

$$\eta = \frac{u}{3}(1 + a\alpha + b\beta) + v\alpha + w\beta$$

is a unit if and only if

$$x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = \pm 27, \tag{24}$$

where u = x, au + 3v = y, and bu + 3w = z. If η is positive, the
plus sign must be chosen in (23) and (24), since η' and η'' are com-
plex conjugates.

The field L has the property that each of its elements is either
rational or of degree three. For if there were an element of degree
two, L would be an extension of the field generated by that element,
and so would be of even degree. It follows that ±1 are the only
roots of unity in L. Since α has one real and two nonreal conjugates,
we see by Theorem 2-45 that either L has only the units ±1, or
else there is a fundamental unit ζ which may be chosen between
0 and 1, such that every unit η of L can be expressed in the form

$$\eta = \pm\zeta^n,$$

where n is a rational integer, positive, negative, or zero.

A positive unit of the form $\eta = x + y\alpha$ is always smaller than 1.
For since $x^3 + dy^3 = 1$, we have

$$\eta^{-1} = \eta'\eta'' = x^2 - xy\alpha + y^2\alpha^2 \ge 1 + \alpha + \alpha^2 > 3,$$

since xy is negative. Consequently, for such a unit we have

$$\eta = \zeta^n, \qquad n > 0.$$

The same remarks apply to a positive unit of the form $x + z\beta$.


3-8 Two lemmas. For simplicity in notation, we define the
binomial coefficient $\binom{m}{k}$ to be zero for k > m. Here and hereafter
in this chapter, lower-case Latin letters stand for rational integers
unless otherwise specified.

Theorem 3-21. Let m be a positive integer. Then

$$\binom{m}{0} + \binom{m}{3} + \binom{m}{6} + \cdots \not\equiv 0 \pmod 3.$$





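Proof: Put

$$S_j = \binom{m}{j} + \binom{m}{j+3} + \binom{m}{j+6} + \cdots, \qquad j = 0, 1, 2.$$

Then

$$S_0 + S_1 + S_2 = 2^m \equiv (-1)^m \pmod 3,$$

and, since $(3k+2)\binom{m}{3k+2} = (m-3k-1)\binom{m}{3k+1}$ and $(3k+1)\binom{m}{3k+1} = (m-3k)\binom{m}{3k}$,

$$S_2 = \binom{m}{2} + \binom{m}{5} + \cdots \equiv -mS_1 + S_1 \pmod 3,$$

$$S_1 = \binom{m}{1} + \binom{m}{4} + \cdots \equiv mS_0 \pmod 3,$$

so that

$$(1 + 2m - m^2)S_0 \equiv (-1)^m \pmod 3.$$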
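
Theorem 3-21 and the two congruences used in its proof can be confirmed by direct computation for small m. This is a quick sketch for illustration; the helper name `S` and the range tested are assumptions, not part of the text.

```python
from math import comb


def S(m, j):
    """Sum of the binomial coefficients C(m, k) over k = j, j+3, j+6, ..."""
    return sum(comb(m, k) for k in range(j, m + 1, 3))


# Values of m for which the sum C(m,0) + C(m,3) + ... is divisible by 3.
bad = [m for m in range(1, 301) if S(m, 0) % 3 == 0]
print(bad)  # []
```

The intermediate relations $S_1 \equiv mS_0$ and $S_2 \equiv (1 - m)S_1 \pmod 3$ can be tested the same way with `S(m, 1)` and `S(m, 2)`.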


Theorem 3-22. Suppose that x and y are integers such that
(x, dy) = 1, and suppose that

$$(x + y\sqrt[3]{d})^n = X + Y\sqrt[3]{d} + Z\sqrt[3]{d^2},$$

where X, Y, and Z are rational and n > 1. Then $XYZ \ne 0$ except
in the following cases:

$$(\sqrt[3]{10} - 1)^5 = 99 - 45\sqrt[3]{10},$$

$$(\sqrt[3]{4} - 1)^4 = -15 + 12\sqrt[3]{2}.$$
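
The two exceptional identities can be verified with exact integer arithmetic in the basis $1, t, t^2$, where $t^3 = d$. The sketch below is illustrative; the helpers `mul` and `power` are names introduced here, and elements are stored as coefficient triples.

```python
def mul(u, v, d):
    """Product of u0 + u1*t + u2*t^2 and v0 + v1*t + v2*t^2, where t^3 = d."""
    u0, u1, u2 = u
    v0, v1, v2 = v
    return (u0 * v0 + d * (u1 * v2 + u2 * v1),
            u0 * v1 + u1 * v0 + d * u2 * v2,
            u0 * v2 + u1 * v1 + u2 * v0)


def power(u, n, d):
    """n-th power of u by repeated multiplication."""
    result = (1, 0, 0)
    for _ in range(n):
        result = mul(result, u, d)
    return result


print(power((-1, 1, 0), 5, 10))  # (99, -45, 0)
print(power((-1, 1, 0), 4, 4))   # (-15, 0, 6)
```

In the second case the triple (-15, 0, 6) means $-15 + 6\sqrt[3]{16}$, and $6\sqrt[3]{16} = 12\sqrt[3]{2}$, so Y = 0 there while Z = 0 in the first case.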

Proof: Since (x, d) = 1, it is clear that $X \ne 0$. Suppose that
Z = 0, so that

$$\binom{n}{2}x^{n-2}y^2 + \binom{n}{5}x^{n-5}y^5d + \binom{n}{8}x^{n-8}y^8d^2 + \cdots = 0. \tag{25}$$

Dividing by $\binom{n}{2}y^2$, this becomes

$$x^{n-2} + \sum_{k \ge 1}\binom{n-2}{3k}\frac{2x^{n-3k-2}y^{3k}d^k}{(3k+1)(3k+2)} = 0. \tag{26}$$

Let q be a prime divisor of y. Then since $q^{3k} \ge 2^{3k} > 3k + 2$ for
$k \ge 1$, each term in the last sum is divisible by q, which is impossible
since (x, y) = 1. Hence $y = \pm 1$.

When $n \equiv 0 \pmod 3$, equation (26) can be written in the form

$$y^{n-3}d^{\frac13(n-3)} + \sum_{k \ge 1}\binom{n-1}{3k}\frac{x^{3k}y^{n-3k-3}d^{\frac13(n-3)-k}}{3k+1} = 0;$$

when $n \equiv 1 \pmod 3$,

$$y^{n-4}d^{\frac13(n-4)} + \sum_{k \ge 1}\binom{n-2}{3k}\frac{2x^{3k}y^{n-3k-4}d^{\frac13(n-4)-k}}{(3k+1)(3k+2)} = 0;$$

and when $n \equiv 2 \pmod 3$,

$$y^{n-2}d^{\frac13(n-2)} + \sum_{k \ge 1}\binom{n}{3k}x^{3k}y^{n-3k-2}d^{\frac13(n-2)-k} = 0.$$

The same argument now shows that $x = \pm 1$, and since it is clear
from (25) that xy < 0, we have $x = -y$.

Now let q be a prime divisor of d, and suppose that $q^t \,\|\, d$ (that is,
$q^t \mid d$ but $q^{t+1} \nmid d$). If $q^t > 5$, then $q^{tk} > 5^k \ge 3k + 2$ for $k \ge 1$, so
that each term in the sum in (26) is divisible by q, which is impossible
since (x, d) = 1. If q = 3, then since $3 \nmid (3k + 1)(3k + 2)$ we reach
the same contradiction. Hence $q^t = 2$ or 5, and d = 2, 5, or 10.

The information obtained so far shows that

$$1 + \sum_{k \ge 1}(-1)^k\binom{n-2}{3k}\frac{2d^k}{(3k+1)(3k+2)} = 0. \tag{27}$$

If d = 10, this becomes

$$\frac16(n - 5)(n^2 - 4n + 6) = \sum_{k \ge 2}(-1)^k\binom{n-2}{3k}\frac{2\cdot 10^k}{(3k+1)(3k+2)}.$$

This equation is true for n = 5, and leads to the first of the excep-
tions mentioned in the theorem. For other values of n, we may
divide through by (n − 5)/6 and obtain

$$n^2 - 4n + 6 = -\sum_{k \ge 1}(-1)^k\binom{n-6}{3k-1}\frac{(n-2)(n-3)(n-4)\cdot 12\cdot 10^{k+1}}{3k(3k+1)(3k+2)(3k+3)(3k+4)(3k+5)}.$$

The highest power of 5 which divides the denominator of a term in
the sum is clearly at most 5(3k + 5), and since $5^{k+1} > 5(3k + 5)$ for
$k \ge 2$, we have

$$n^2 - 4n + 6 = (n-2)^2 + 2 \equiv \binom{n-6}{2}\frac{(n-2)(n-3)(n-4)\cdot 12\cdot 10^2}{3\cdot4\cdot5\cdot6\cdot7\cdot8} \equiv 0 \pmod 5,$$

which is false since −2 is a quadratic nonresidue of 5.
When d = 2 or 5, equation (27) leads to the congruence

$$1 + \binom{n-2}{3} + \binom{n-2}{6} + \cdots \equiv 0 \pmod 3,$$

which is false by Theorem 3-21.

There remains only the possibility that Y = 0. The proof that
this happens only in the case of the second exception mentioned in
the theorem is completely similar to what has just been done for the
case Z = 0, and we leave the details to the reader. (The only varia-
tion lies in the fact that d may now have the sole prime divisor 2,
so that d = 2 or 4.)


3-9 The Delaunay-Nagell theorem. As we shall see in the next
chapter, there is a general theorem which implies that the equation

$$ax^3 + by^3 = c \tag{28}$$

has only finitely many solutions in integers x, y if a, b, and c are
nonzero integers. In certain special cases, however, it is possible
to make more precise statements about the number and nature of
possible solutions. We shall concern ourselves here with the equation

$$x^3 + dy^3 = 1, \tag{29}$$

which was first considered in detail by B. Delaunay. His method
was later refined by T. Nagell, who also applied it to (28) in the case
that c = 1 or 3. Nagell's result concerning (29) is as follows.

Theorem 3-23. Equation (29) has at most one solution in integers
x, y different from zero. If $x_1, y_1$ is a solution, the number $x_1 + y_1\sqrt[3]{d}$
is either the fundamental unit of $L = R(\sqrt[3]{d})$ or its square; the
latter can happen for only finitely many values of d.

If d = ±1, (29) has only trivial solutions. If d contains a cube
larger than 1, it can be absorbed into the factor $y^3$. Hence we can
assume that d is cube-free and larger than 1.

The idea of the proof is quite simple. If

$$N(x_1 + y_1\sqrt[3]{d}) = x_1^3 + dy_1^3 = 1, \qquad y_1 \ne 0,$$

then $x_1 + y_1\sqrt[3]{d}$ is a positive unit of L, and as such is a positive
power of the fundamental unit ζ mentioned at the end of Section 3-7.
It therefore suffices to show that no power of a positive unit smaller
than 1, with exponent larger than 2, is of the special form $x + y\sqrt[3]{d}$,
and to show that the square of a unit is of this form in only finitely
many cases. We divide the proof into four parts, summarized in the
next four theorems.
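
Before turning to the proof, the statement of Theorem 3-23 can be illustrated by a bounded search for solutions of $x^3 + dy^3 = 1$ with $y \ne 0$. The sketch is illustrative only; `exact_cube_root`, `nontrivial_solutions`, and the bound on |y| are assumptions of the example, and a finite search of course proves nothing about larger y.

```python
def exact_cube_root(n):
    """Return the integer r with r**3 == n, or None if n is not a perfect cube."""
    m = abs(n)
    r = round(m ** (1.0 / 3.0))
    for c in (r - 1, r, r + 1):      # guard against floating-point rounding
        if c >= 0 and c ** 3 == m:
            return c if n >= 0 else -c
    return None


def nontrivial_solutions(d, y_max):
    """Integer pairs (x, y) with x**3 + d*y**3 == 1 and 0 < |y| <= y_max."""
    sols = []
    for y in range(-y_max, y_max + 1):
        if y == 0:
            continue
        x = exact_cube_root(1 - d * y ** 3)
        if x is not None:
            sols.append((x, y))
    return sols


for d in (2, 7, 19, 20):
    print(d, nontrivial_solutions(d, 200))
# 2 [(-1, 1)]
# 7 [(2, -1)]
# 19 [(-8, 3)]
# 20 [(-19, 7)]
```

In each case the search finds a single pair, as the theorem predicts.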

Theorem 3-24. The square of an irrational unit of L of the form

$$\eta = x + y\alpha + z\beta, \qquad x, y, z \text{ in } Z,$$

is itself of the form $X + Y\alpha$ only if

$$\eta = 1 + \sqrt[3]{20} - \sqrt[3]{50}.$$

The square of a unit of L of the form

$$\eta = \tfrac13(x + y\alpha + z\beta), \qquad 3 \nmid xyz,$$

(if such exists) is itself of the form $X + Y\alpha$ for only finitely many
values of d.

Proof: Let $\eta = x + y\alpha + z\beta$
be a positive unit of L, so that, by (23),

$$x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = 1 \tag{30}$$

and

$$\eta^2 = (x^2 + 2abyz) + (2xy + az^2)\alpha + (2xz + by^2)\beta.$$

If the coefficient of β in this last expression is 0, then

$$z = -\frac{by^2}{2x},$$

and substituting this into (30) we obtain

$$x^3 + dy^3 - \frac{d^2y^6}{8x^3} + \frac32dy^3 = 1,$$

or

$$d^2y^6 - 20x^3\,dy^3 - 8(x^6 - x^3) = 0,$$

whence

$$dy^3 = 10x^3 \pm 2x\sqrt{27x^4 - 2x}. \tag{31}$$

Thus the number $27x^4 - 2x$ must be a square:

$$(27x^3 - 2)x = t^2. \tag{32}$$

If x is even, then $(27x^3 - 2, x) = 2$, so that

$$27x^3 - 2 = \pm 2u^2, \qquad x = \pm 2v^2.$$

Since −1 is a quadratic nonresidue of 3, we must choose the lower
sign, and eliminating x we obtain

$$108v^6 + 1 = u^2,$$

$$(u - 1)(u + 1) = 108v^6.$$

Since (u − 1, u + 1) = 2, this implies that

$$u \pm 1 = 54r^6, \qquad u \mp 1 = 2s^6,$$

whence

$$27r^6 - s^6 = (3r^2)^3 - (s^2)^3 = \pm 1.$$

From the truth of Fermat's conjecture for n = 3, it follows that r = 0,
which gives v = 0 and x = 0. But then also y = 0, by (31), which
is impossible since $z\beta$ is not a unit.

If x is odd, (32) yields

$$27x^3 - 2 = \pm u^2, \qquad x = \pm v^2.$$

Here the upper sign must be chosen, and we have

$$(3x)^3 = u^2 + 2,$$

which by Theorem 3-19 has the sole solution x = 1, u = ±5. By
(31), $dy^3 = 10 \pm 10$, so that d = 20, y = 1. (If y = 0, then z = 0,
and η is rational.) The sole solution is therefore

$$(1 + \sqrt[3]{20} - \sqrt[3]{50})^2 = -19 + 7\sqrt[3]{20}.$$


Now let η be a positive unit of the form

$$\eta = \tfrac13(x + y\alpha + z\beta).$$

Then by (24),

$$x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = 27, \tag{33}$$

$$9\eta^2 = (x^2 + 2abyz) + (2xy + az^2)\alpha + (2xz + by^2)\beta. \tag{34}$$

If $3 \mid x$, also $3 \mid y$ and $3 \mid z$, and we have already treated this case. Sup-
pose that $3 \nmid x$. If the coefficient of β in the expression for $\eta^2$ is 0, we
again have

$$z = -\frac{by^2}{2x}.$$

Substituting this into (33), it follows that

$$dy^3 = 10x^3 \pm 6x\sqrt{3x^4 - 6x}, \tag{35}$$

so that

$$3x^4 - 6x = t^2. \tag{36}$$

If x is even, the fact that $3 \nmid x$ implies that

$$x^3 - 2 = \pm 6u^2, \qquad x = \pm 2v^2,$$

whence

$$\pm 4v^6 - 1 = \pm 3u^2.$$

Since $3 \nmid (4v^6 + 1)$, we must choose the upper sign; the last equation
can then be written as

$$(u + 1)^3 - (u - 1)^3 = (2v^2)^3,$$

so that |u| = 1. Hence x = 2, and by (35), $dy^3 = 80 \pm 72$. The
lower sign yields d = 1 or 8, both of which are excluded. Hence
d = 19, y = 2, and z = −1. The only solution in this case is

$$\left(\frac{2 + 2\sqrt[3]{19} - \sqrt[3]{361}}{3}\right)^2 = -8 + 3\sqrt[3]{19}.$$

If x is odd, (36) implies that

$$x^3 - 2 = \pm 3u^2, \qquad x = \pm v^2,$$

so that

$$\pm v^6 - 2 = \pm 3u^2.$$

The lower sign must be chosen: $x = -v^2$ and

$$3u^2 - 2 = v^6. \tag{37}$$

But it is an immediate consequence of Theorem 4-17, to be proved
in the next chapter, that (37) has only finitely many solutions, and
the proof is complete.

We note for future use that if u, v satisfy (37), then v must be odd.
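
The two explicit units found in the proof of Theorem 3-24 can be checked with integer arithmetic in the basis $1, \alpha, \beta$, using $\alpha^2 = b\beta$, $\beta^2 = a\alpha$, $\alpha\beta = ab$. The following sketch is illustrative; `square` and `norm_form` are names chosen here, and units of the second kind are represented by the numerator triple (x, y, z) of $\tfrac13(x + y\alpha + z\beta)$.

```python
def square(eta, a, b):
    """Coefficients of (x + y*alpha + z*beta)^2 in the basis 1, alpha, beta."""
    x, y, z = eta
    return (x * x + 2 * a * b * y * z,
            2 * x * y + a * z * z,
            2 * x * z + b * y * y)


def norm_form(eta, a, b):
    """The cubic form x^3 + a b^2 y^3 + a^2 b z^3 - 3abxyz of (23) and (24)."""
    x, y, z = eta
    return x**3 + a * b**2 * y**3 + a**2 * b * z**3 - 3 * a * b * x * y * z


# d = 20 = 5 * 2^2: eta = 1 + alpha - beta
print(norm_form((1, 1, -1), 5, 2), square((1, 1, -1), 5, 2))    # 1 (-19, 7, 0)
# d = 19: eta = (2 + 2*alpha - beta)/3, so the triple below is 9 * eta^2
print(norm_form((2, 2, -1), 19, 1), square((2, 2, -1), 19, 1))  # 27 (-72, 27, 0)
```

Dividing the second square by 9 gives $-8 + 3\sqrt[3]{19}$, and indeed $(-8)^3 + 19\cdot 3^3 = 1$.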

Theorem 3-25. The fourth power of a positive irrational unit of L is
never of the form $X + Y\alpha$.

Proof: Let ε be such a unit,

$$\epsilon = \tfrac13(x_1 + y_1\alpha + z_1\beta),$$

and suppose that

$$\epsilon^4 = X + Y\alpha.$$

Then since the coefficient of β in $\epsilon^4$ is 0, we have

$$6bx_1^2y_1^2 + 4x_1^3z_1 + 4ab^2y_1^3z_1 + 12abx_1y_1z_1^2 + a^2bz_1^4 = 0. \tag{38}$$

If we put

$$\eta = \epsilon^2 = \tfrac13(x + y\alpha + z\beta),$$

then

$$x = \tfrac13(x_1^2 + 2aby_1z_1),$$

$$y = \tfrac13(2x_1y_1 + az_1^2),$$

$$z = \tfrac13(2x_1z_1 + by_1^2).$$

Since $\eta^2 = X + Y\alpha$, we can apply Theorem 3-24. The cases

$$d = 20, \qquad x = y = -z = 3,$$

$$d = 19, \qquad x = y = 2, \quad z = -1$$

are impossible, since in the first the above equation for z becomes
$-9 = 2x_1z_1 + 2y_1^2$, while in the second the system is easily seen to
be inconsistent for all choices of signs of $x_1, y_1, z_1$. Hence it must be
that $x = -v^2$, where v is odd, so that

$$3v^2 + x_1^2 = -2aby_1z_1.$$

Since v is odd, so is $x_1$, so that $3v^2 + x_1^2 \equiv 4 \pmod 8$. Hence three
of the numbers $a, b, y_1, z_1$ are odd, and the fourth is even. By (38),
$a^2bz_1^4$ is even, so $y_1$ is odd. If either a or $z_1$ is even, (38) implies
that $6bx_1^2y_1^2 \equiv 0 \pmod 4$, which is false. If b is even, (38) implies
that $a^2bz_1^4 \equiv 0 \pmod 4$, which is false since b is square-free. The
proof is complete.

Theorem 3-26. The cube of a positive irrational unit of L is never
of the form $X + Y\alpha$.

Proof: If

$$\eta = \tfrac13(x + y\alpha + z\beta)$$

is a positive unit, the coefficient of β in $\eta^3$ is

$$\tfrac19(bxy^2 + x^2z + abyz^2).$$

We see from the equation

$$x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = 27 \tag{39}$$

that (x, b) = 1, and deduce from the equation

$$bxy^2 + x^2z + abyz^2 = 0 \tag{40}$$

that $b \mid z$. From (39) again, δ = (x, y, z) = 1 or 3. Since $\eta \ne \pm 1$,
y and z are not both zero, and we can write

$$x = \delta d_1d_2x_1, \qquad y = \delta d_2d_3y_1, \qquad z = \delta b\,d_1d_3z_1, \tag{41}$$

where 



and $x_1 > 0$, $y_1 > 0$, $z_1 > 0$. The numbers $d_1x_1$, $d_2y_1$, $d_3z_1$ are rela-
tively prime in pairs. Substituting the values from (41) into (40)
and dividing by $\delta^3bd_1d_2d_3$, we obtain

$$d_2^2d_3x_1y_1^2 + d_1^2d_2x_1^2z_1 + ab^2d_1d_3^2y_1z_1^2 = 0.$$

It follows from this that $d_1 \mid x_1$, $d_2 \mid y_1$, and $d_3 \mid z_1$. Putting

$$x_1 = d_1x_2, \qquad y_1 = d_2y_2, \qquad z_1 = d_3z_2,$$

substituting, and dividing by $d_1d_2d_3$, we obtain

$$d_2^3x_2y_2^2 + d_1^3x_2^2z_2 + ab^2d_3^3y_2z_2^2 = 0.$$

A consequence of this is that $x_2 \mid ab^2d_3^3y_2z_2^2$, which in turn implies
that $x_2 = 1$. Similarly, $y_2 = z_2 = 1$, so that

$$d_1^3 + d_2^3 + ab^2d_3^3 = 0 \tag{42}$$

and

$$x = \delta d_1^2d_2, \qquad y = \delta d_2^2d_3, \qquad z = \delta b\,d_1d_3^2.$$

Substituting these values into (39), we obtain

$$d_1^6d_2^3 + ab^2d_2^6d_3^3 + a^2b^4d_1^3d_3^6 - 3ab^2d_1^3d_2^3d_3^3 = \left(\frac{3}{\delta}\right)^3.$$

Eliminating $ab^2d_3^3$ between this equation and (42), we have

$$d_1^9 + 6d_1^6d_2^3 + 3d_1^3d_2^6 - d_2^9 = \left(\frac{3}{\delta}\right)^3, \tag{43}$$

and putting $d_1^3 = u$, $d_2^3 = v$, $3/\delta = w$, this becomes

$$u^3 + 6u^2v + 3uv^2 - v^3 = w^3. \tag{44}$$

But it is easily verified that

$$(u^3 + 6u^2v + 3uv^2 - v^3)\,U^3 = V^3 + W^3,$$

where

$$U = u^2 + uv + v^2, \qquad V = u^3 + 3u^2v - v^3, \qquad W = 3u^2v + 3uv^2.$$

Since neither U nor V is zero for relatively prime u and v, (44) can
hold with $w \ne 0$ only if W = 0, that is, if u = −v. In this case
w = v. Since $(d_1, d_2) = 1$, it follows that $d_1 = -1$, $d_2 = 1$, δ = 3.
This, however, leads to the values x = 3, y = −3, z = 0, for which
the coefficient of β in $\eta^3$ is not zero.
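
The polynomial identity used at the end of this proof, $(u^3 + 6u^2v + 3uv^2 - v^3)(u^2 + uv + v^2)^3 = (u^3 + 3u^2v - v^3)^3 + (3u^2v + 3uv^2)^3$, is stated in the text as "easily verified". A minimal sketch that checks it on a grid of integers follows; the function names are illustrative. Both sides are forms of degree 9 in u, v, so agreement on this many points already forces the identity.

```python
def lhs(u, v):
    """(u^3 + 6u^2 v + 3u v^2 - v^3) * U^3 with U = u^2 + uv + v^2."""
    return (u**3 + 6 * u**2 * v + 3 * u * v**2 - v**3) * (u * u + u * v + v * v) ** 3


def rhs(u, v):
    """V^3 + W^3 with V = u^3 + 3u^2 v - v^3 and W = 3u^2 v + 3u v^2."""
    V = u**3 + 3 * u**2 * v - v**3
    W = 3 * u**2 * v + 3 * u * v**2
    return V**3 + W**3


ok = all(lhs(u, v) == rhs(u, v) for u in range(-20, 21) for v in range(-20, 21))
print(ok)  # True
```

With (44) this turns the problem into $(wU)^3 = V^3 + W^3$, to which Fermat's conjecture for exponent 3 applies.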


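Theorem 3-27. If p > 3 is prime and

$$\eta = \tfrac13(x + y\alpha + z\beta)$$

is a positive unit smaller than 1, then $\eta^p$ is not of the form $X + Y\alpha$.

Proof: Suppose that z = 0. Then $3 \mid x$ and $3 \mid y$, and

$$N\eta = \left(\frac{x}{3}\right)^3 + d\left(\frac{y}{3}\right)^3 = 1,$$

so that $\left(\frac{x}{3},\, d\,\frac{y}{3}\right) = 1$. By Theorem 3-22, the coefficient of β in $\eta^p$ is
not zero. Thus $z \ne 0$. By the same reasoning (applied in the field
L' = R(β) = R(α) = L), y cannot be zero.

As we saw in the proof of Theorem 3-20, it follows from the
representation

$$\omega = x_1 + x_2\alpha + x_3\beta$$

of an arbitrary integer ω of L that

$$\alpha(\omega + \rho\omega' + \rho^2\omega'') = 3abx_3.$$

Taking $\omega = \eta^p$, we see that if the coefficient of β in $\eta^p$ is zero, it must
be that

$$\left(\frac{x + y\alpha + z\beta}{3}\right)^p + \rho\left(\frac{x + y\rho\alpha + z\rho^2\beta}{3}\right)^p + \rho^2\left(\frac{x + y\rho^2\alpha + z\rho\beta}{3}\right)^p = 0. \tag{45}$$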


Suppose first that $p \equiv 1 \pmod 3$. Then since $\rho^p = \rho$, (45) can
be written in the form

$$\left(\frac{x\rho + y\rho^2\alpha + z\beta}{3}\right)^p + \left(\frac{x\rho^2 + y\rho\alpha + z\beta}{3}\right)^p = -\left(\frac{x + y\alpha + z\beta}{3}\right)^p.$$

Since p is odd, the left side is divisible by

$$\frac{x\rho + y\rho^2\alpha + z\beta}{3} + \frac{x\rho^2 + y\rho\alpha + z\beta}{3} = \frac{-x - y\alpha + 2z\beta}{3};$$

this number is an integer, and since it divides $\eta^p$ it is a unit. Conse-
quently,

$$-x^3 - ab^2y^3 + 8a^2bz^3 - 6abxyz = \pm 27.$$

Since η is a positive unit, also

$$x^3 + ab^2y^3 + a^2bz^3 - 3abxyz = 27,$$

and by addition,

$$9a^2bz^3 - 9abxyz = 0 \text{ or } 54.$$

In the first case $az^2 - xy$ must be zero. But this number is the
coefficient of α in 3/η, and as we saw at the end of Section 3-7, it is
not zero, since 1/η > 1.

In the second case we have

$$abz(az^2 - xy) = 6.$$

But then x, y, z are not all divisible by 3, so that L is of the second
kind. This is impossible, since if $ab \mid 6$ then $a^2 - b^2 \not\equiv 0 \pmod 9$.

The case in which $p \equiv 2 \pmod 3$ proceeds similarly. Equation
(45) can be written in the form

$$\left(\frac{x\rho^2 + y\alpha + z\rho\beta}{3}\right)^p + \left(\frac{x\rho + y\alpha + z\rho^2\beta}{3}\right)^p = -\left(\frac{x + y\alpha + z\beta}{3}\right)^p,$$

from which it follows that the number

$$\frac{x\rho^2 + y\alpha + z\rho\beta}{3} + \frac{x\rho + y\alpha + z\rho^2\beta}{3} = \frac{-x + 2y\alpha - z\beta}{3}$$

is a unit. As before,

$$9ab^2y^3 - 9abxyz = 0 \text{ or } 54.$$

Since $by^2 - xz$ is the coefficient of β in 3/η, it is not zero. But it is
also impossible that $(by^2 - xz) \mid 6$ and $ab \mid 6$, since then L must be of
both the first and second kinds. The proof is complete.

Theorems 3-25, 3-26, and 3-27 show that any nonzero solution of
$x^3 + dy^3 = 1$ must correspond either to the fundamental unit of L,
or to its square. Not both of these numbers can lead to solutions, by
Theorem 3-22 with n = 2. This completes the proof of Theorem 3-23.

CHAPTER 4


THE THUE-SIEGEL-ROTH THEOREM 


4-1 Introduction. It is shown in introductory texts in number
theory* that if α is a quadratic irrationality (that is, an algebraic
number of degree two), then there is a positive constant c such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^2}$$

for every pair of rational integers p, q with q > 0. The idea used
there suffices to prove the following generalization, which is due to
J. Liouville.

Theorem 4-1. If α is an algebraic number of degree $n \ge 2$, then
there exists a positive constant c such that

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^n} \tag{1}$$

for every pair of rational integers p, q with q > 0.


Proof: Let α be a zero of the irreducible polynomial

$$f(x) = a_0x^n + a_1x^{n-1} + \cdots + a_n, \qquad a_0 > 0,$$

with coefficients in Z, and let $\alpha_1 = \alpha, \alpha_2, \dots, \alpha_n$ be its conjugates,
so that

$$f(x) = a_0(x - \alpha)(x - \alpha_2)\cdots(x - \alpha_n).$$

Then the number

$$q^nf(p/q) = a_0p^n + a_1p^{n-1}q + \cdots + a_nq^n$$

is a rational integer different from zero, and it therefore has absolute
value at least 1. Hence

$$\left|\alpha - \frac{p}{q}\right| = \frac{|q^nf(p/q)|}{a_0q^n\prod_{k=2}^{n}\left|\alpha_k - \frac{p}{q}\right|} \ge \frac{1}{a_0q^n\prod_{k=2}^{n}\left|\alpha_k - \frac{p}{q}\right|}.$$

Put

$$\overline{|\alpha|} = \max(|\alpha_1|, \dots, |\alpha_n|).$$

* See, for example, Volume I, Section 8-4. In Section 8-5 Hurwitz'
theorem is stated and proved, and in Chapter 9 the problem of approxi-
mating real numbers by rationals is considered; all this material is assumed
in the present section.


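We consider two cases, according as |p/q| is greater than $2\overline{|\alpha|}$ or not.
In the first case we have the trivial lower bound

$$\left|\alpha - \frac{p}{q}\right| > \overline{|\alpha|} \ge \frac{\overline{|\alpha|}}{q^n}.$$

In the second case the inequality $|\alpha_k - p/q| < 3\overline{|\alpha|}$ holds for
k = 2, ..., n, and, by the inequality of the preceding paragraph,

$$\left|\alpha - \frac{p}{q}\right| > \frac{1}{a_0q^n\,(3\overline{|\alpha|})^{n-1}}.$$

Thus, the theorem holds with

$$c = \min\left(\overline{|\alpha|},\; \frac{1}{a_0\,(3\overline{|\alpha|})^{n-1}}\right).$$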
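
A small numerical experiment shows what Liouville's bound says in a concrete case. The sketch below takes $\alpha = \sqrt[3]{2}$ (so n = 3, $a_0 = 1$, and all three conjugates have absolute value $\sqrt[3]{2}$) and assumes a constant of the form $c = \min\bigl(H, 1/(a_0(3H)^{n-1})\bigr)$, where H is the largest absolute value of a conjugate; that form, the function name, and the range of q are assumptions of this illustration. Floating-point arithmetic is adequate for the modest denominators used here.

```python
def liouville_constant(a0, conjugate_abs, n):
    """c = min(H, 1 / (a0 * (3H)^(n-1))), H = largest |conjugate| (assumed form)."""
    H = max(conjugate_abs)
    return min(H, 1.0 / (a0 * (3 * H) ** (n - 1)))


alpha = 2 ** (1 / 3)                      # real zero of x^3 - 2
c = liouville_constant(1, [alpha] * 3, 3)

# For each q the closest numerator is round(alpha*q); other p only do worse.
worst = min(q ** 3 * abs(alpha - round(alpha * q) / q) for q in range(1, 2001))
print(c < worst)  # True
```

Here c is about 0.07, while the smallest observed value of $q^3|\alpha - p/q|$ is comfortably larger, so inequality (1) holds with room to spare on this range.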



Liouville used this theorem to show the existence of nonalgebraic
numbers; this will be discussed in detail in the next chapter. At the
moment, let us consider a hypothetical improvement of Theorem
4-1, in which the inequality (1) is replaced by

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^{\nu}}, \tag{2}$$

where ν is any number smaller than n. A. Thue noticed that if
such a theorem could be proved, it would have the important conse-
quence that the Diophantine equation

$$q^nf(p/q) = a_0p^n + a_1p^{n-1}q + \cdots + a_nq^n = A \tag{3}$$

can have only finitely many solutions for any fixed rational integer A
different from zero, if f(x) has distinct zeros. To see this, let the zeros
of f(x) again be $\alpha_1 = \alpha, \dots, \alpha_n$, and put

$$\gamma = \min_{i \ne j}\,|\alpha_i - \alpha_j|.$$

Suppose that (3) has infinitely many solutions p, q. Then there
must be at least one $\alpha_k$, which by suitable naming we can take to be
α, which is a limit point of the numbers p/q, since otherwise the
quantity

$$|q^nf(p/q)| = a_0q^n\prod_{k=1}^{n}\left|\alpha_k - \frac{p}{q}\right|$$

is certainly not bounded as q increases indefinitely. There
must therefore be infinitely many solutions of (3) for which
$|\alpha - p/q| < \gamma/2$. But for all such solutions,

$$\left|\alpha - \frac{p}{q}\right| = \frac{|A|}{a_0q^n\prod_{k=2}^{n}\left|\alpha_k - \frac{p}{q}\right|} < \frac{|A|}{a_0q^n(\gamma/2)^{n-1}},$$

and this is at variance with (2) if ν is a constant smaller than n and
q is sufficiently large.

Thue showed that (2) holds with

$$\nu > \frac{n}{2} + 1.$$

Later C. L. Siegel improved Thue's result, showing that (2) holds
with

$$\nu > \min_{1 \le s \le n-1}\left(\frac{n}{s+1} + s\right),$$

and in particular with $\nu = 2\sqrt{n}$. In 1947 F. J. Dyson made the
further improvement $\nu > \sqrt{2n}$, and finally in 1955 K. F. Roth
proved that (2) holds with ν = 2 + ε, for each ε > 0, for all but a
finite number of fractions p/q. This is the best theorem possible if
ν is to be independent of n, since Hurwitz' theorem shows that the
corresponding statement is false for every irrational algebraic number,
for ν = 2 and suitable c. Roth's work is similar in some respects to a
simplification of Dyson's proof, published by T. Schneider in 1948.

In addition to the problem of sharpening Theorem 4-1 by decreas-
ing the exponent of q, we may also consider the question of extend-
ing the methods so as to analyze the approximability of an algebraic
number by other algebraic numbers. This is not mere generalization
for its own sake: as we saw in the preceding chapter, it is natural to
consider the solvability of Fermat's equation in a larger set of integers
than Z, and the same is true of many other Diophantine equations.
But if the variables in an equation range over the integers of an
algebraic number field, then to the extent that approximation
theorems are useful at all they must be formulated in terms of alge-
braic rather than rational numbers.

While Siegel gave many algebraic variants of his basic result, Roth
presented a detailed proof only in the rational case. In this chapter
we give a complete proof of a useful algebraic version of Roth's
theorem. Unfortunately, the proof is complicated; the student
might profit by first examining Schneider's work mentioned above.

We shall proceed as follows. In the next three sections we shall 
make some definitions, and obtain some preliminary results, which 
are needed for the proof of the main theorem: in Section 4-2 some 
properties of polynomials will be treated, in Section 4-3 the concept 
of the generalized Wronskian will be introduced, and in Section 4-4 
the index of a polynomial will be defined and discussed. Then we 
shall proceed to prove, in Sections 4-5 and 4-6, several lemmas on 
which the proof of the main theorem depends, and finally, in Section 
4-7, we shall state and prove the Thue-Siegel-Roth theorem itself. 
In the remainder of the chapter, some applications of the theorem 

will be taken up. 

4-2 Polynomials. If P(z) is a polynomial with arbitrary complex
coefficients, we denote by $\|P\|$ the maximum of the absolute values of
its coefficients. If α is an algebraic number and P(z) = 0 is its de-
fining equation, so that P is irreducible and has relatively prime
coefficients in Z, we define the height H(α) to be $\|P\|$. Finally, if P has
algebraic coefficients, we designate by $\overline{|P|}$ the maximum of the
absolute values of their conjugates. Clearly $\|P\| = \overline{|P|}$ if P has
coefficients in Z, and for a nonzero constant polynomial P(z) = α the
new definition of $\overline{|\alpha|}$ agrees with the old one.

Except when a polynomial is written as a determinant, it will be
supposed that no two terms have the same exponents on the variable,
or sets of exponents on the variables.

Theorem 4-2. Let $l, \lambda_1, \dots, \lambda_h$ be complex numbers, and put

$$L(z) = l\prod_{k=1}^{h}(z - \lambda_k).$$

Then

$$|l|\prod_{k=1}^{h}(1 + |\lambda_k|) < 6^h\|L\|.$$

Proof: There is no loss in generality in supposing that l = 1, since
a change in l affects in the same way the two sides of the inequality to
be proved. Let $\lambda_1, \dots, \lambda_t$ be those of the λ's such that $|\lambda_k| \le 2$. If
$f(z) = \prod_{k=1}^{t}(z - \lambda_k)$, then there is a complex number $z_0$ with $|z_0| = 1$
for which $|f(z_0)| \ge 1$. To see this, let ε be a primitive (t + 1)th root of unity,
and suppose that

$$f(z) = \sum_{r=0}^{t}\mu_rz^r, \qquad \mu_t = 1.$$

Then

$$\sum_{\nu=0}^{t}\epsilon^{\nu}f(\epsilon^{\nu}) = \sum_{r=0}^{t}\mu_r\sum_{\nu=0}^{t}\epsilon^{\nu(r+1)}.$$

But

$$\sum_{\nu=0}^{t}\epsilon^{\nu(r+1)} = \begin{cases} 0 & \text{if } (t+1) \nmid (r+1), \\ t + 1 & \text{if } (t+1) \mid (r+1), \end{cases} \tag{4}$$

and since $r \le t$, $(t+1) \mid (r+1)$ if and only if r = t. Hence

$$\sum_{\nu=0}^{t}\epsilon^{\nu}f(\epsilon^{\nu}) = (t+1)\mu_t = t + 1,$$

so that one of the t + 1 numbers $|f(\epsilon^{\nu})|$ is at least 1. Thus

$$\prod_{k=1}^{t}(1 + |\lambda_k|) \le (1 + 2)^t = 3^t \le 3^t\left|\prod_{k=1}^{t}(z_0 - \lambda_k)\right|. \tag{5}$$

If t < h, then for k = t + 1, ..., h we have $|\lambda_k| > 2$ and

$$\frac{1 + |\lambda_k|}{|z_0 - \lambda_k|} \le \frac{1 + |\lambda_k|}{|\lambda_k| - |z_0|} = \frac{|\lambda_k| + 1}{|\lambda_k| - 1} = 1 + \frac{2}{|\lambda_k| - 1} < 1 + \frac{2}{2 - 1} = 3,$$

so that

$$\prod_{k=t+1}^{h}(1 + |\lambda_k|) < 3^{h-t}\left|\prod_{k=t+1}^{h}(z_0 - \lambda_k)\right|.$$

Combining this with (5), we have

$$\prod_{k=1}^{h}(1 + |\lambda_k|) < 3^h\left|\prod_{k=1}^{h}(z_0 - \lambda_k)\right| \le 3^h\|L\|(|z_0|^h + \cdots + 1) = 3^h(h + 1)\|L\| \le 6^h\|L\|.$$

Theorem 4-3. Suppose that f(z) and g(z) are polynomials with
complex coefficients, of degrees n and m respectively. Suppose further
that the coefficient of $z^m$ in g(z) has absolute value at least 1. Then

$$\|f\| < 6^{n+m}\|fg\|.$$

Proof: Let

$$f(z) = a_0(z - \lambda_1)\cdots(z - \lambda_n),$$

$$g(z) = b_0(z - \lambda_{n+1})\cdots(z - \lambda_{n+m}).$$

Then

$$\|f\| \le |a_0|\prod_{k=1}^{n}(1 + |\lambda_k|) \le |a_0b_0|\prod_{k=1}^{n}(1 + |\lambda_k|) \le |a_0b_0|\prod_{k=1}^{n+m}(1 + |\lambda_k|),$$

and the desired result follows from Theorem 4-2.
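
The inequalities of Theorems 4-2 and 4-3, read here as $|l|\prod(1 + |\lambda_k|) < 6^h\|L\|$ and $\|f\| < 6^{n+m}\|fg\|$, can be spot-checked on random polynomials. This is only a sketch under that reading; the helper names, the random ranges, and the seed are assumptions, and coefficient lists are stored lowest degree first.

```python
import random


def poly_mul(f, g):
    """Coefficients of the product of two polynomials (lowest degree first)."""
    out = [0j] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def from_roots(lead, roots):
    """Coefficients of lead * prod (z - r) over the given roots."""
    poly = [complex(lead)]
    for r in roots:
        poly = poly_mul(poly, [-r, 1])
    return poly


def max_norm(poly):
    """Largest absolute value of a coefficient."""
    return max(abs(c) for c in poly)


def check_theorem_4_2(lead, roots):
    lhs = abs(lead)
    for r in roots:
        lhs *= 1 + abs(r)
    return lhs < 6 ** len(roots) * max_norm(from_roots(lead, roots))


def check_theorem_4_3(f, g):
    """Assumes the leading coefficient of g has absolute value at least 1."""
    n, m = len(f) - 1, len(g) - 1
    return max_norm(f) < 6 ** (n + m) * max_norm(poly_mul(f, g))


random.seed(0)
samples = [[complex(random.uniform(-4, 4), random.uniform(-4, 4)) for _ in range(h)]
           for h in range(1, 9) for _ in range(50)]
print(all(check_theorem_4_2(1.5, roots) for roots in samples))  # True
```

The factor $6^h$ is far from sharp on random data, so the check is insensitive to floating-point rounding.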


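Theorem 4-4. If $f(z)$ is an arbitrary polynomial of degree $n$, with real coefficients, then

\[ \|f\|^{m} \le (mn+1)\,\|f^{m}\|. \tag{6} \]

Proof: Let $f(z) = a_0 + a_1 z + \cdots + a_n z^n$, and put $\|f\| = a$. The theorem is certainly true if either $|a_0| = a$ or $|a_n| = a$, since the first and last coefficients in $f^m(z)$ are the $m$th powers of $a_0$ and $a_n$, respectively, so that in this case $\|f^m\| \ge \|f\|^m$. If

\[ f^*(z) = z^n f(1/z), \]

then clearly

\[ \|f^*\| = \|f\| \quad \text{and} \quad \|(f^m)^*\| = \|(f^*)^m\|, \]

so that we can suppose, with no loss in generality, that the numerically largest of all the coefficients in $f(z)$ is $a_t$, where $\tfrac12 n \le t < n$.

Put

\[ g(z,\theta) = f(z) - a e^{i\theta} z^n, \]

and let $\alpha = \alpha(\theta)$ be the numerically largest of the zeros of $g(z,\theta)$, for each $\theta$. The inequality (6) holds if, for some $\theta$, $|\alpha(\theta)| \ge 1$. For

\[ |f^m(\alpha)| = |a e^{i\theta} \alpha^n|^m = a^m |\alpha|^{mn}, \]

while for $|\alpha| \ge 1$,

\[ |f^m(\alpha)| \le \|f^m\| \,(1 + |\alpha| + \cdots + |\alpha|^{mn}) \le \|f^m\| \,(mn+1)\,|\alpha|^{mn}, \]

so that

\[ a^m = \|f\|^m \le \|f^m\| \,(mn+1). \]

We know that $|a_n| < a$. Hence if $f(1) \ge a$, then

\[ g(1,0) \ge a - a = 0, \qquad g(\infty,0) = -\infty, \]

and $1 \le |\alpha(0)| < \infty$. Similarly, if $f(1) \le -a$, then

\[ g(1,\pi) \le -a + a = 0, \qquad g(\infty,\pi) = \infty, \]

and $1 \le |\alpha(\pi)| < \infty$. This proves the theorem unless $|f(1)| < a$, which we henceforth assume.

Now put $z = e^{i\varphi}$, so that

\[ g(e^{i\varphi},\theta) = f(e^{i\varphi}) - a e^{i(\theta + n\varphi)}. \]

If we find a $\varphi_0$ such that $|f(e^{i\varphi_0})| = a$, then $\theta_0$ can be determined so that $g(e^{i\varphi_0},\theta_0) = 0$; this gives $|\alpha(\theta_0)| \ge 1$ and proves the theorem.

Since $|f(e^{i\varphi})|$ is a continuous function of $\varphi$, and since $|f(1)| < a$, it suffices to prove the existence of a $\varphi$ such that $|f(e^{i\varphi})| \ge a$.

Let $\epsilon$ be a primitive $(t+1)$th root of unity, where $|a_t| = a$ and $\tfrac12 n \le t < n$. Then

\[ \sum_{\nu=0}^{t} \epsilon^{\nu} f(\epsilon^{\nu}) = \sum_{\nu=0}^{t} \sum_{k=0}^{n} a_k \epsilon^{\nu(k+1)} = \sum_{k=0}^{n} a_k \sum_{\nu=0}^{t} \epsilon^{\nu(k+1)}. \]

Since $k \le n$ and $t \ge \tfrac12 n$, we have that $(t+1) \mid (k+1)$ if and only if $k = t$. Hence, by (4),

\[ \sum_{\nu=0}^{t} \epsilon^{\nu} f(\epsilon^{\nu}) = a_t (t+1), \]

so that for some $\nu$,

\[ |\epsilon^{\nu} f(\epsilon^{\nu})| = |f(\epsilon^{\nu})| \ge |a_t| = a. \]

The proof is complete.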
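Inequality (6) is easy to test on small examples. The sketch below is an illustration under simple assumptions: polynomials are stored as coefficient lists `[a_0, a_1, ..., a_n]` with a nonzero last entry, and the helper names (`poly_mul`, `poly_pow`, `height`, `check_theorem_4_4`) are hypothetical.

```python
def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h


def poly_pow(f, m):
    """Return the m-th power of f for a non-negative integer m."""
    result = [1]
    for _ in range(m):
        result = poly_mul(result, f)
    return result


def height(f):
    """The quantity ||f||: the largest absolute value of a coefficient."""
    return max(abs(c) for c in f)


def check_theorem_4_4(f, m):
    """Test ||f||^m <= (m*n + 1) * ||f^m||, where n = deg f."""
    n = len(f) - 1
    return height(f) ** m <= (m * n + 1) * height(poly_pow(f, m))


# f(z) = 1 - 3z + z^2: the largest coefficient is the middle one.
print(poly_pow([1, -3, 1], 2))
print(check_theorem_4_4([1, -3, 1], 2))
```

The example is chosen so that the largest coefficient is neither the first nor the last, which is the case the proof has to work for.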

Theorem 4-5. If $f_1(z), \ldots, f_t(z)$ are polynomials with algebraic coefficients, then

\[ \overline{\Bigl| \prod_{\nu=1}^{t} f_\nu \Bigr|} \le \prod_{\nu=1}^{t} (1 + \deg f_\nu) \prod_{\nu=1}^{t} \overline{|f_\nu|}. \]

Proof: There is no loss in generality in supposing that

\[ \deg f_1 \ge \deg f_2 \ge \cdots \ge \deg f_t. \]

The product $f_1 f_2$ is a polynomial each of whose coefficients is a sum of products of a coefficient of $f_1$ and a coefficient of $f_2$, the number of summands being at most $1 + \deg f_2$. Hence

\[ \overline{|f_1 f_2|} \le (1 + \deg f_2)\, \overline{|f_1|}\; \overline{|f_2|}. \]

Similarly,

\[ \overline{|f_1 f_2 f_3|} \le (1 + \deg f_3)\, \overline{|f_1 f_2|}\; \overline{|f_3|} \le (1 + \deg f_3)(1 + \deg f_2)\, \overline{|f_1|}\; \overline{|f_2|}\; \overline{|f_3|}, \]

and so on.
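For polynomials with rational integer coefficients the bar quantity is simply the largest absolute value of a coefficient, so the estimate of Theorem 4-5 can be tried out directly. The sketch below makes that simplifying assumption (integer coefficients only); the function names are hypothetical.

```python
from functools import reduce


def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h


def height(f):
    """Largest absolute value of a coefficient (integer coefficients assumed)."""
    return max(abs(c) for c in f)


def product_height(polys):
    """Height of the product f_1 f_2 ... f_t."""
    return height(reduce(poly_mul, polys))


def theorem_4_5_bound(polys):
    """Right side of Theorem 4-5: prod (1 + deg f_v) * prod height(f_v)."""
    bound = 1
    for f in polys:
        bound *= len(f) * height(f)  # len(f) = 1 + deg f
    return bound


polys = [[2, -5, 1], [3, 4]]
print(product_height(polys), theorem_4_5_bound(polys))
```

The bound is usually far from sharp; its value in the text is that it grows only polynomially in the degrees.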


Theorem 4-6. Let $p$ and $r$ be positive integers, with $1 \le r < p$. Suppose that $F(z_1, \ldots, z_p)$, $G(z_1, \ldots, z_r)$, $H(z_{r+1}, \ldots, z_p)$ are polynomials with coefficients in an algebraic number field $K$, those of $F$ being integers, and suppose that

\[ F(z_1, \ldots, z_p) = G(z_1, \ldots, z_r)\, H(z_{r+1}, \ldots, z_p). \]

Then if $\gamma$ is any coefficient in $F$, there is a factorization $\gamma = \alpha\beta$ in $K$ such that the coefficients in $\alpha H$ and $\beta G$ are integers in $K$.

Proof: Let the coefficients in $G$ be $\alpha_1, \alpha_2, \ldots$ and those in $H$ be $\beta_1, \beta_2, \ldots$, in some order. Then, since the variables in $G$ and $H$ are disjoint, the coefficients in $F$ are simply the products $\alpha_i \beta_j$. Since the coefficients in $F$ are integers, all the products $\alpha_i \beta_1, \alpha_i \beta_2, \ldots$ are integers, as are all the products $\beta_j \alpha_1, \beta_j \alpha_2, \ldots$. But these two sets of numbers are just the coefficients in $\alpha_i H$ and $\beta_j G$.

4-3 Generalized Wronskians. Polynomials $f_0(z_1, \ldots, z_p), \ldots, f_{l-1}(z_1, \ldots, z_p)$ with coefficients in an algebraic number field $K$ are said to be linearly dependent if some linear combination of them, with constant coefficients in $K$ which are not all zero, vanishes identically, and are otherwise said to be independent. In the case of a single independent variable, it is well known that the question of independence of a set of functions can sometimes be settled by reference to their Wronskian. For our purposes it is convenient to define this as the determinant

\[ W(z) = \det\left( \frac{1}{\mu!} \frac{d^{\mu} f_\nu}{dz^{\mu}} \right), \qquad \mu, \nu = 0, 1, \ldots, l-1, \]

which differs from the usual definition only in the presence of the nonzero constant factor

\[ \frac{1}{1!\,2! \cdots (l-1)!}. \]

The exact relation of the behavior of the Wronskian to independence, as applied to polynomials, is indicated in the first part of the next theorem.

For functions of several variables, the situation is not quite so simple, since there are then several partial derivatives to consider. We proceed as follows. Let $\Delta_0, \Delta_1, \ldots, \Delta_{l-1}$ be differential operators of the form

\[ \Delta_\mu = \frac{1}{j_1! \cdots j_p!} \left( \frac{\partial}{\partial z_1} \right)^{j_1} \cdots \left( \frac{\partial}{\partial z_p} \right)^{j_p}, \]

such that the order $j_1 + \cdots + j_p$ of $\Delta_\mu$ does not exceed $\mu$, for $0 \le \mu \le l-1$. Then the function

\[ \begin{vmatrix} \Delta_0 f_0 & \Delta_0 f_1 & \cdots & \Delta_0 f_{l-1} \\ \Delta_1 f_0 & \Delta_1 f_1 & \cdots & \Delta_1 f_{l-1} \\ \vdots & \vdots & & \vdots \\ \Delta_{l-1} f_0 & \Delta_{l-1} f_1 & \cdots & \Delta_{l-1} f_{l-1} \end{vmatrix} \]

is called a generalized Wronskian of $f_0, \ldots, f_{l-1}$. Except in the trivial case $p = l = 1$, there are several $\Delta_\mu$'s for each $\mu$, and hence more than one generalized Wronskian. In the case of functions of one variable, the ordinary Wronskian is that generalized Wronskian for which the order of $\Delta_\mu$ is exactly $\mu$ for $0 \le \mu \le l-1$.

Theorem 4-7. (a) If $f_0, \ldots, f_{l-1}$ are $l$ polynomials over $K$ in the single variable $z$, whose Wronskian $W(z)$ vanishes identically, then they are dependent over $K$.

(b) If $f_0, \ldots, f_{l-1}$ are $l$ polynomials over $K$ in the variables $z_1, \ldots, z_p$, for which every generalized Wronskian $G_i(z_1, \ldots, z_p)$ vanishes identically, then they are dependent over $K$.

Proof: (a) The proof in this case is by induction. If $l = 1$, then $W(z) = f_0(z)$, and the truth of the theorem is obvious.

Take $l > 1$, and suppose that the theorem is true for every set of $l - 1$ polynomials, $f_0, f_1, \ldots, f_{l-2}$, over $K$; suppose also that the Wronskian $W_l$ of $f_0, \ldots, f_{l-1}$ vanishes identically. If $f_0, \ldots, f_{l-2}$ are dependent, so are $f_0, \ldots, f_{l-1}$, and the assertion is proved.

Suppose then that $f_0, \ldots, f_{l-2}$ are independent, so that their Wronskian $W_{l-1}$ is not identically zero. Now $W_{l-1}$, being a polynomial, has only finitely many zeros; let $I$ be an interval in which it does not vanish, and take $z$ in $I$. For such $z$ the system of equations

\[ \sum_{k=0}^{l-2} f_k^{(j)}(z)\, y_k = f_{l-1}^{(j)}(z), \qquad j = 0, 1, \ldots, l-2, \tag{7} \]

can be solved for the $y$'s as rational functions of $z$. But then, by subtracting appropriate multiples of each column of $W_l$ from its last column, we obtain

\[ 0 = 1! \cdots (l-1)!\, W_l = \begin{vmatrix} f_0(z) & f_1(z) & \cdots & 0 \\ f_0'(z) & f_1'(z) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ f_0^{(l-1)}(z) & f_1^{(l-1)}(z) & \cdots & f_{l-1}^{(l-1)}(z) - \sum_{k=0}^{l-2} f_k^{(l-1)}(z)\, y_k \end{vmatrix} \]

\[ = 1! \cdots (l-2)! \left( f_{l-1}^{(l-1)}(z) - \sum_{k=0}^{l-2} f_k^{(l-1)}(z)\, y_k \right) W_{l-1}, \]

so that also

\[ \sum_{k=0}^{l-2} f_k^{(l-1)}(z)\, y_k = f_{l-1}^{(l-1)}(z). \tag{8} \]

Differentiating (7) gives

\[ \sum_{k=0}^{l-2} f_k^{(j+1)}(z)\, y_k + \sum_{k=0}^{l-2} f_k^{(j)}(z)\, y_k' = f_{l-1}^{(j+1)}(z), \qquad j = 0, \ldots, l-2, \]

and comparison of this for $j = l-2$ with (8), and for $j = 0, \ldots, l-3$ with (7), shows that

\[ \sum_{k=0}^{l-2} f_k^{(j)}(z)\, y_k' = 0, \qquad j = 0, \ldots, l-2. \]

Since $W_{l-1} \ne 0$, it must be that

\[ y_0' = \cdots = y_{l-2}' = 0, \]

so that the $y$'s are constants, say $y_k = c_k$, and they are clearly in $K$. But then the polynomial

\[ f_{l-1}(z) - \sum_{k=0}^{l-2} c_k f_k(z) \]

vanishes throughout $I$, and therefore identically, so that the $l$ polynomials $f_0, f_1, \ldots, f_{l-1}$ are dependent.

(b) This case is proved by contradiction. Suppose that the $l$ polynomials $f_0(z_1, \ldots, z_p), \ldots, f_{l-1}(z_1, \ldots, z_p)$ are independent, and suppose further that for each $\nu$, $f_\nu$ is of degree less than $k$ in each of its arguments, so that we can write

\[ f_\nu(z_1, \ldots, z_p) = \sum_{k_1=0}^{k-1} \cdots \sum_{k_p=0}^{k-1} b_\nu(k_1, \ldots, k_p)\, z_1^{k_1} \cdots z_p^{k_p}, \qquad 0 \le \nu \le l-1. \]

Then the polynomials $f_\nu(t, t^k, \ldots, t^{k^{p-1}})$ are linearly independent. For otherwise there would be an identity in $t$ of the form

\[ \sum_{\nu=0}^{l-1} c_\nu \sum_{k_1=0}^{k-1} \cdots \sum_{k_p=0}^{k-1} b_\nu(k_1, \ldots, k_p)\, t^{k_1 + k_2 k + \cdots + k_p k^{p-1}} = 0, \]

or

\[ \sum_{k_1=0}^{k-1} \cdots \sum_{k_p=0}^{k-1} \left( \sum_{\nu=0}^{l-1} c_\nu b_\nu(k_1, \ldots, k_p) \right) t^{k_1 + k_2 k + \cdots + k_p k^{p-1}} = 0, \]

and it would follow from the uniqueness of the representation of an integer to the base $k$ that for each set of exponents $k_1, \ldots, k_p$,

\[ \sum_{\nu=0}^{l-1} c_\nu b_\nu(k_1, \ldots, k_p) = 0, \]

whence

\[ \sum_{\nu=0}^{l-1} c_\nu f_\nu(z_1, \ldots, z_p) = 0, \]

contrary to assumption.

We know therefore that the Wronskian

\[ W(t) = \det\left( \frac{1}{\mu!} \frac{d^{\mu}}{dt^{\mu}} f_\nu(t, t^k, \ldots, t^{k^{p-1}}) \right) \]

does not vanish identically. By a standard differentiation formula,

\[ \frac{d}{dt} = \frac{\partial}{\partial z_1} + k t^{k-1} \frac{\partial}{\partial z_2} + \cdots + k^{p-1} t^{k^{p-1}-1} \frac{\partial}{\partial z_p}, \]

and it follows easily by induction on $\mu$ that an operator identity

\[ \frac{1}{\mu!} \left( \frac{d}{dt} \right)^{\mu} = \varphi_1(t) \Delta_1 + \cdots + \varphi_r(t) \Delta_r \]

holds, where $\Delta_1, \ldots, \Delta_r$ are differential operators of orders not exceeding $\mu$, $r$ depends only on $\mu$ and $p$, and $\varphi_1, \ldots, \varphi_r$ are polynomials with rational coefficients. Using this in the above expression for $W(t)$, and writing the resulting determinant as a sum of other determinants, an expression for $W(t)$ of the form

\[ W(t) = \psi_1(t)\, G_1(t, t^k, \ldots, t^{k^{p-1}}) + \cdots + \psi_s(t)\, G_s(t, t^k, \ldots, t^{k^{p-1}}) \]

results, in which $\psi_1, \ldots, \psi_s$ are polynomials and $G_1, \ldots, G_s$ are generalized Wronskians of $f_0, \ldots, f_{l-1}$. Since $W(t)$ does not vanish identically, there is an $i$ for which $G_i(t, t^k, \ldots, t^{k^{p-1}})$ is not identically zero, and a fortiori $G_i(z_1, \ldots, z_p)$ is not identically zero.
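The substitution $z_j = t^{k^{j-1}}$ used in part (b) sends each monomial with exponents below $k$ to a distinct power of $t$, because the exponents are just the base-$k$ digits of the resulting power. The following sketch illustrates this; the dictionary representation of polynomials and the names `kronecker_exponent` and `substitute` are assumptions made for the example.

```python
from itertools import product


def kronecker_exponent(exps, k):
    """Exponent k_1 + k_2*k + ... + k_p*k^(p-1) of t for the monomial
    z_1^k_1 ... z_p^k_p under the substitution z_j = t^(k^(j-1))."""
    return sum(e * k ** i for i, e in enumerate(exps))


def substitute(poly, k):
    """Map a polynomial {exponent tuple: coefficient}, of degree < k in each
    variable, to the one-variable polynomial {power of t: coefficient}."""
    result = {}
    for exps, coeff in poly.items():
        if any(e >= k for e in exps):
            raise ValueError("every exponent must be smaller than k")
        power = kronecker_exponent(exps, k)
        result[power] = result.get(power, 0) + coeff
    return result


# All 4^3 exponent triples with entries below k = 4 give different powers of t.
k, p = 4, 3
powers = {kronecker_exponent(e, k) for e in product(range(k), repeat=p)}
print(len(powers) == k ** p)
```

Since no two monomials collide, a linear relation among the substituted polynomials forces the same relation among the original ones, which is the step the proof needs.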

Theorem 4-8. Let $R(z_1, \ldots, z_p)$ be a polynomial in $p \ge 2$ variables, with integral coefficients in $K$, such that

\[ 0 < \overline{|R|} \le B. \]

Let $R$ be of degree at most $r_j$ in $z_j$, for $j = 1, \ldots, p$. Then there is an $l$ in $Z$ with

\[ 1 \le l \le r_p + 1, \tag{9} \]

there is an integer $\beta$ in $K$, and there are differential operators $\Delta_0, \ldots, \Delta_{l-1}$ on the variables $z_1, \ldots, z_{p-1}$, of orders at most $0, \ldots, l-1$, respectively, such that if

\[ F(z_1, \ldots, z_p) = \beta \det\left( \Delta_\mu \frac{1}{\nu!} \left( \frac{\partial}{\partial z_p} \right)^{\nu} R \right), \qquad \mu, \nu = 0, \ldots, l-1, \tag{10} \]

then

(a) $F$ has integral coefficients in $K$ and is not identically zero;

(b) a decomposition

\[ F(z_1, \ldots, z_p) = U(z_1, \ldots, z_{p-1})\, V(z_p) \tag{11} \]

holds, where $U$ and $V$ have integral coefficients in $K$, $U$ is of degree at most $l r_j$ in $z_j$ for $j = 1, \ldots, p-1$, and $V$ is of degree at most $l r_p$ in $z_p$;

(c) the following bound holds:

\[ \overline{|F|} \le \bigl( (r_1+1) \cdots (r_p+1) \bigr)^{2l}\, 2^{2(r_1 + \cdots + r_p) l}\, (l!)^2 B^{2l}. \]

Proof: Write $R$ as a polynomial in $z_p$:

\[ R(z_1, \ldots, z_p) = \sum_{\kappa=0}^{r_p} S_\kappa(z_1, \ldots, z_{p-1})\, z_p^{\kappa}. \]

The polynomials $S_\kappa$ need not be independent; let $\psi_\nu(z_1, \ldots, z_{p-1})$, for $\nu = 0, \ldots, l-1$, be a maximal set of independent polynomials among the $S_\kappa$, so that $1 \le l \le r_p + 1$. Then there are constants $\beta_{\nu\kappa}$ in $K$ such that for $\kappa = 0, \ldots, r_p$,

\[ S_\kappa(z_1, \ldots, z_{p-1}) = \sum_{\nu=0}^{l-1} \beta_{\nu\kappa}\, \psi_\nu(z_1, \ldots, z_{p-1}). \tag{12} \]

If we put

\[ \varphi_\nu(z_p) = \sum_{\kappa=0}^{r_p} \beta_{\nu\kappa}\, z_p^{\kappa}, \qquad \nu = 0, \ldots, l-1, \]

then

\[ R(z_1, \ldots, z_p) = \sum_{\nu=0}^{l-1} \psi_\nu(z_1, \ldots, z_{p-1})\, \varphi_\nu(z_p), \tag{13} \]

and $\varphi_0, \ldots, \varphi_{l-1}$ are independent. For if $\delta_0, \ldots, \delta_{l-1}$ are constants such that

\[ \delta_0 \varphi_0(z_p) + \cdots + \delta_{l-1} \varphi_{l-1}(z_p) = 0, \]

the coefficient of each power of $z_p$ must be zero, so that

\[ \delta_0 \beta_{0\kappa} + \cdots + \delta_{l-1} \beta_{l-1,\kappa} = 0 \tag{14} \]

for $\kappa = 0, \ldots, r_p$. For fixed $\nu_0$ with $0 \le \nu_0 \le l-1$, choose $\kappa_0$ so that $S_{\kappa_0}(z_1, \ldots, z_{p-1}) = \psi_{\nu_0}(z_1, \ldots, z_{p-1})$; this is possible since the $\psi$'s are a subset of the $S$'s. Then (12) shows that

\[ \beta_{\nu\kappa_0} = \begin{cases} 1 & \text{if } \nu = \nu_0, \\ 0 & \text{if } \nu \ne \nu_0. \end{cases} \]

Choosing $\kappa = \kappa_0$ in (14), we obtain $\delta_{\nu_0} = 0$. Since $\nu_0$ is arbitrary, every $\delta_\nu = 0$.

Let $W(z_p)$ be the Wronskian of $\varphi_0, \ldots, \varphi_{l-1}$; it is a polynomial with coefficients in $K$, and it does not vanish identically. Let $G(z_1, \ldots, z_{p-1})$ be some generalized Wronskian of $\psi_0, \ldots, \psi_{l-1}$ which is not identically zero. Then

\[ W(z_p) = \det\left( \frac{1}{\mu!} \frac{d^{\mu}}{dz_p^{\mu}} \varphi_\nu(z_p) \right), \qquad \mu, \nu = 0, \ldots, l-1, \]

\[ G(z_1, \ldots, z_{p-1}) = \det\bigl( \Delta_\mu \psi_\nu(z_1, \ldots, z_{p-1}) \bigr), \]

where $\Delta_0, \ldots, \Delta_{l-1}$ are differential operators on $z_1, \ldots, z_{p-1}$, of orders at most $0, \ldots, l-1$ respectively. Taking the row-by-row product of $G$ and $W$, we obtain

\[ GW = \det\left( \Delta_\mu \frac{1}{\nu!} \left( \frac{\partial}{\partial z_p} \right)^{\nu} R(z_1, \ldots, z_p) \right). \tag{15} \]

Since $W$ is a determinant of order $l$ whose elements are polynomials in $z_p$ of degrees at most $r_p$, it is clear that $\deg W \le l r_p$. Similarly, $G$ is of degree at most $l r_j$ in $z_j$, for $j = 1, \ldots, p-1$.

In the expression (15) for $GW$, we can write $R$ as the sum of $(r_1+1) \cdots (r_p+1)$ terms of the form

\[ \gamma_{s_1 \cdots s_p}\, z_1^{s_1} \cdots z_p^{s_p}. \]

The determinant can then be written as a sum of

\[ \bigl( (r_1+1) \cdots (r_p+1) \bigr)^{l} \]


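new determinants, each having entries of the form

\[ \gamma_{s_1 \cdots s_p} \binom{s_1}{i_1} \cdots \binom{s_p}{i_p} z_1^{s_1 - i_1} \cdots z_p^{s_p - i_p}, \]

in which $i_j \le s_j$ for $j = 1, \ldots, p$. Here

\[ \binom{s}{i} = \frac{s!}{i!\,(s-i)!} \le 2^{s}. \]

Thus the entries of each new determinant are such that the maxima of the absolute values of their conjugates do not exceed

\[ B\, 2^{s_1} \cdots 2^{s_p} \le 2^{r_1 + \cdots + r_p} B, \]

and hence

\[ \overline{|GW|} \le \bigl( (r_1+1) \cdots (r_p+1) \bigr)^{l}\, l!\, 2^{(r_1 + \cdots + r_p) l} B^{l}. \]

The coefficients in $GW$ are integers in $K$. It follows from Theorem 4-6 that if $\beta$ is any one of them which is not zero, there is a factorization $\beta = \beta_1 \beta_2$ in $K$ such that $\beta_1 G = U$ and $\beta_2 W = V$ have integral coefficients in $K$, and

\[ \beta G W = F = UV. \]

By the bound just obtained for $\overline{|GW|}$, we have

\[ 0 < \overline{|F|} \le \overline{|GW|}^{\,2} \le \bigl( (r_1+1) \cdots (r_p+1) \bigr)^{2l}\, (l!)^2\, 2^{2(r_1 + \cdots + r_p) l} B^{2l}. \]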

4-4 The index. Let $P(z_1, \ldots, z_p)$ be any polynomial in $p$ variables which does not vanish identically. Let $\alpha_1, \ldots, \alpha_p$ be any complex numbers, and let $r_1, \ldots, r_p$ be any positive numbers. We define the index $\theta$ of $P$ at the point $(\alpha_1, \ldots, \alpha_p)$ relative to $r_1, \ldots, r_p$ as follows. Expand $P(\alpha_1 + y_1, \ldots, \alpha_p + y_p)$ as a polynomial in $y_1, \ldots, y_p$, say

\[ P(\alpha_1 + y_1, \ldots, \alpha_p + y_p) = \sum_{j_1=0}^{\infty} \cdots \sum_{j_p=0}^{\infty} c(j_1, \ldots, j_p)\, y_1^{j_1} \cdots y_p^{j_p}. \]

Then

\[ \theta = \min \left( \frac{j_1}{r_1} + \cdots + \frac{j_p}{r_p} \right), \]

the minimum being extended over all sets of non-negative integers $j_1, \ldots, j_p$ for which $c(j_1, \ldots, j_p) \ne 0$, or, equivalently, for which

\[ \left( \frac{\partial}{\partial z_1} \right)^{j_1} \cdots \left( \frac{\partial}{\partial z_p} \right)^{j_p} P(\alpha_1, \ldots, \alpha_p) \ne 0. \]

Note that $\theta \ge 0$ always, and that $\theta = 0$ if and only if

\[ P(\alpha_1, \ldots, \alpha_p) \ne 0. \]

Moreover, if any derived polynomial

\[ \left( \frac{\partial}{\partial z_1} \right)^{i_1} \cdots \left( \frac{\partial}{\partial z_p} \right)^{i_p} P(z_1, \ldots, z_p) \]

is not identically zero, it is clear that its index at $(\alpha_1, \ldots, \alpha_p)$ relative to $r_1, \ldots, r_p$ is at least

\[ \theta - \frac{i_1}{r_1} - \cdots - \frac{i_p}{r_p}. \]

The following properties, which we list in a theorem for later reference, are also immediate consequences of the definition.


Theorem 4-9. Let $P(z_1, \ldots, z_p)$ and $Q(z_1, \ldots, z_p)$ be polynomials, neither of which vanishes identically. Then if we consider indices formed at the same point $(\alpha_1, \ldots, \alpha_p)$ relative to the same numbers $r_1, \ldots, r_p$, the following relations hold:

\[ \operatorname{index}(P + Q) \ge \min(\operatorname{index} P, \operatorname{index} Q), \tag{16} \]

\[ \operatorname{index} PQ = \operatorname{index} P + \operatorname{index} Q. \tag{17} \]

Equation (17) remains true if $P$ is a polynomial in $z_1, \ldots, z_{p-1}$ only, and $Q$ is a polynomial in $z_p$ only, provided that the index of $P$ is taken at $(\alpha_1, \ldots, \alpha_{p-1})$ relative to $r_1, \ldots, r_{p-1}$, and that of $Q$ at $\alpha_p$ relative to $r_p$.
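The definition of the index translates directly into a computation: shift the polynomial to the point in question and minimize $j_1/r_1 + \cdots + j_p/r_p$ over the surviving monomials. The sketch below does this with exact rational arithmetic for points with rational coordinates. It is an illustration only; the dictionary representation `{exponent tuple: coefficient}` and the names `shift`, `index`, and `multiply` are assumptions of the example.

```python
from fractions import Fraction
from itertools import product
from math import comb


def shift(poly, point):
    """Coefficients c(j_1, ..., j_p) of P(alpha_1 + y_1, ..., alpha_p + y_p)."""
    shifted = {}
    for exps, coeff in poly.items():
        # Expand each factor (alpha_i + y_i)^(s_i) by the binomial theorem.
        for js in product(*(range(s + 1) for s in exps)):
            term = Fraction(coeff)
            for s, j, a in zip(exps, js, point):
                term *= comb(s, j) * Fraction(a) ** (s - j)
            shifted[js] = shifted.get(js, 0) + term
    return {js: c for js, c in shifted.items() if c != 0}


def index(poly, point, weights):
    """Index of a nonzero polynomial at `point` relative to `weights`."""
    coeffs = shift(poly, point)
    return min(
        sum(Fraction(j, r) for j, r in zip(js, weights)) for js in coeffs
    )


def multiply(p, q):
    """Product of two polynomials in the dictionary representation."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


# P = (z1 - 1)^2 (z2 - 2) and Q = (z1 - 1) + (z2 - 2)^2 at the point (1, 2).
P = multiply({(2, 0): 1, (1, 0): -2, (0, 0): 1}, {(0, 1): 1, (0, 0): -2})
Q = {(1, 0): 1, (0, 2): 1, (0, 1): -4, (0, 0): 3}
point, weights = (1, 2), (2, 1)
print(index(P, point, weights), index(Q, point, weights),
      index(multiply(P, Q), point, weights))
```

In this example the index of the product equals the sum of the two indices, in agreement with (17).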
Now let $r_1, \ldots, r_m$ be positive integers, and suppose that $B \ge 1$. We consider the set $\mathfrak{R}_m = \mathfrak{R}_m(B; r_1, \ldots, r_m)$ of polynomials $R(z_1, \ldots, z_m)$ which satisfy the following conditions:

(a) $R$ has integral coefficients in $K$, and is not identically zero.

(b) $R$ is of degree at most $r_j$ in $z_j$, for $j = 1, \ldots, m$.

(c) $\overline{|R|} \le B$.

Let $\zeta_1, \ldots, \zeta_m$ be algebraic numbers (not necessarily in $K$) of heights $H(\zeta_1) = q_1, \ldots, H(\zeta_m) = q_m$. Let $\theta(R)$ denote the index of $R(z_1, \ldots, z_m)$ at the point $(\zeta_1, \ldots, \zeta_m)$ relative to $r_1, \ldots, r_m$. Our object in the present section is to obtain, under certain conditions, an upper bound for $\theta(R)$ in terms of $B$, $q_1, \ldots, q_m$, $r_1, \ldots, r_m$. We therefore define

\[ \Theta_m(B; q_1, \ldots, q_m; r_1, \ldots, r_m) = \sup \theta(R), \tag{18} \]

the supremum, or least upper bound, being taken over all $R$ in $\mathfrak{R}_m$ and all algebraic numbers $\zeta_1, \ldots, \zeta_m$ of heights $q_1, \ldots, q_m$, respectively.

The double significance of $r_1, \ldots, r_m$ in the definition (18) should be noted; these numbers occur both in the definition of the index and in condition (b) above.

We proceed by induction on $m$. In Theorem 4-10 the case $m = 1$ is treated, in Theorem 4-11 there is given a recurrence relation between $\Theta_{m-1}$ and $\Theta_m$, and in Theorem 4-12 an explicit bound is obtained.


Theorem 4-10.

\[ \Theta_1(B; q_1; r_1) \le \frac{3N(N+1)}{\log q_1} + \frac{N \log B}{r_1 \log q_1}. \]

Proof: Let the defining polynomial of $\zeta_1$ be

\[ \chi(z_1) = d_0 z_1^{h} + \cdots + d_h, \qquad d_0 \ne 0, \]

where $d_0, \ldots, d_h$ are relatively prime rational integers, so that

\[ \|\chi\| = H(\zeta_1) = q_1 = \max(|d_0|, \ldots, |d_h|). \]

Each polynomial $R$ in $\mathfrak{R}_1$ has integral coefficients in $K$; regarding these coefficients as polynomials in a single primitive element, we can obtain other polynomials from $R$ by successively replacing this primitive element throughout by its various conjugates. Let $R^*$ be the product of these $N$ polynomials. By the Symmetric Function Theorem, $R^*$ has coefficients in $Z$. Also, $\deg R^* = N r_1$, and by Theorem 4-5,

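\[ \|R^*\| \le (1 + r_1)^{N} B^{N}. \]

By the definition of the index, $R(z_1)$ is divisible by $(z_1 - \zeta_1)^{r_1 \theta}$, and the same is therefore true of $R^*(z_1)$. Since $R^*(z_1)$ has coefficients in $Z$, it is divisible by $\chi^{r_1 \theta}(z_1)$. One consequence of this fact is that $h r_1 \theta \le N r_1$. Also, it follows from Theorem 4-3 that

\[ \|\chi^{r_1 \theta}\| \le 6^{N r_1} \|R^*\|, \]

and, by Theorem 4-4,

\[ q_1^{r_1 \theta} = \|\chi\|^{r_1 \theta} \le (h r_1 \theta + 1)\, \|\chi^{r_1 \theta}\| \le (N r_1 + 1)\, 6^{N r_1} (1 + r_1)^{N} B^{N}. \]

Since $N r_1 + 1 \le 2^{N r_1}$ and $1 + r_1 \le 2^{r_1}$, the right side is at most $12^{N(N+1) r_1} B^{N}$. Hence

\[ \theta \le \frac{N(N+1) \log 12}{\log q_1} + \frac{N \log B}{r_1 \log q_1}, \]

and the theorem follows from the fact that $\log 12 < 3$.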


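Theorem 4-11. Let $p \ge 2$ be a positive integer, let $r_1, \ldots, r_p$ be positive integers such that

\[ r_p > 10 \delta^{-1}, \qquad \frac{r_{j-1}}{r_j} > \delta^{-1} \quad \text{for } j = 2, \ldots, p, \tag{19} \]

where $0 < \delta < 1$, and let $q_1, \ldots, q_p$ be positive integers. Then

\[ \Theta_p(B; q_1, \ldots, q_p; r_1, \ldots, r_p) \le 2 \max\, (\Phi + \Phi^{1/2} + \delta^{1/2}), \tag{20} \]

where the maximum is taken over integers $l$ satisfying

\[ 1 \le l \le r_p + 1, \tag{21} \]

and where

\[ \Phi = \Theta_1(M; q_p; l r_p) + \Theta_{p-1}(M; q_1, \ldots, q_{p-1}; l r_1, \ldots, l r_{p-1}), \tag{22} \]

and

\[ M = (r_1 + 1)^{2pl}\, 2^{2 r_1 p l}\, (l!)^2 B^{2l}. \]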



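Proof: Let $R(z_1, \ldots, z_p)$ be any polynomial of the class $\mathfrak{R}_p(B; r_1, \ldots, r_p)$ and let $\zeta_1, \ldots, \zeta_p$ be algebraic numbers of heights $q_1, \ldots, q_p$ respectively. Then $R$ satisfies the hypotheses of Theorem 4-8, so that there are numbers $l$ and $\beta$ and a polynomial $F(z_1, \ldots, z_p)$ having the properties listed there. By Theorem 4-8,

\[ \overline{|F|} \le \bigl( (r_1+1) \cdots (r_p+1) \bigr)^{2l}\, 2^{2(r_1 + \cdots + r_p) l}\, (l!)^2 B^{2l}, \]

and hence

\[ \overline{|F|} \le (r_1 + 1)^{2pl}\, 2^{2 r_1 p l}\, (l!)^2 B^{2l} = M, \]

since $r_1 > r_2 > \cdots > r_p$ by (19). From the factorization

\[ F(z_1, \ldots, z_p) = U(z_1, \ldots, z_{p-1})\, V(z_p) \]

and the fact that the arguments of $U$ and $V$ are disjoint, it follows that also

\[ \overline{|U|} \le M, \qquad \overline{|V|} \le M. \]

The polynomial $U(z_1, \ldots, z_{p-1})$ has degree at most $l r_j$ in $z_j$, for $j = 1, \ldots, p-1$. It is therefore an element of the class

\[ \mathfrak{R}_{p-1}(M; l r_1, \ldots, l r_{p-1}). \]

Hence, its index at $(\zeta_1, \ldots, \zeta_{p-1})$ relative to $l r_1, \ldots, l r_{p-1}$ is at most

\[ \Theta_{p-1}(M; q_1, \ldots, q_{p-1}; l r_1, \ldots, l r_{p-1}). \]

It follows from the definition of the index that the index of $U$ at that point relative to $r_1, \ldots, r_{p-1}$ is at most

\[ l\, \Theta_{p-1}(M; q_1, \ldots, q_{p-1}; l r_1, \ldots, l r_{p-1}). \]

Similarly, $V(z_p)$ is an element of the class $\mathfrak{R}_1(M; l r_p)$, and its index at $\zeta_p$ relative to $r_p$ is at most

\[ l\, \Theta_1(M; q_p; l r_p). \]

By the last sentence of Theorem 4-9, the index of $F = UV$ at $(\zeta_1, \ldots, \zeta_p)$ relative to $r_1, \ldots, r_p$ is the sum of the indices of $U$ and $V$, so that

\[ \operatorname{index} F \le l \Phi, \tag{24} \]

where $\Phi$ is defined in (22).

We now deduce from the determinantal representation of $F$ in equation (10) a lower bound for the index of $F$ in terms of the index $\theta$ of $R$. Consider first any differential operator of the form

\[ \Delta = \frac{1}{i_1! \cdots i_{p-1}!} \left( \frac{\partial}{\partial z_1} \right)^{i_1} \cdots \left( \frac{\partial}{\partial z_{p-1}} \right)^{i_{p-1}} \]

of order

\[ w = i_1 + \cdots + i_{p-1} \le l - 1. \]

If the polynomial

\[ \Delta \frac{1}{\nu!} \left( \frac{\partial}{\partial z_p} \right)^{\nu} R \]

does not vanish identically, its index at $(\zeta_1, \ldots, \zeta_p)$ relative to $r_1, \ldots, r_p$ is at least

\[ \theta - \frac{i_1}{r_1} - \cdots - \frac{i_{p-1}}{r_{p-1}} - \frac{\nu}{r_p} \ge \theta - \frac{w}{r_{p-1}} - \frac{\nu}{r_p}. \]

Now

\[ \frac{w}{r_{p-1}} \le \frac{l-1}{r_{p-1}} \le \frac{r_p}{r_{p-1}} < \delta, \]

by the inequalities (21) and (19). Hence, since the index is non-negative, it must be at least

\[ \max\left( 0,\; \theta - \frac{\nu}{r_p} - \delta \right). \]

If we expand the determinant on the right side of (10), we obtain for $F$ a sum of $l!$ terms, a typical term being

\[ \pm \beta\, (\Delta_{\mu_0} R) \left( \Delta_{\mu_1} \frac{\partial R}{\partial z_p} \right) \cdots \left( \Delta_{\mu_{l-1}} \frac{1}{(l-1)!} \left( \frac{\partial}{\partial z_p} \right)^{l-1} R \right), \]

where $\Delta_{\mu_0}, \ldots, \Delta_{\mu_{l-1}}$ are differential operators on $z_1, \ldots, z_{p-1}$ whose orders are at most $l - 1$. By Theorem 4-9, the index of such a term, if it does not vanish identically, is at least

\[ \sum_{\nu=0}^{l-1} \max\left( 0,\; \theta - \frac{\nu}{r_p} - \delta \right) \ge \sum_{\nu=0}^{l-1} \max\left( 0,\; \theta - \frac{\nu}{r_p} \right) - l\delta. \]

Since $F$ is a sum of such terms, it follows from Theorem 4-9 that

\[ \operatorname{index} F \ge \sum_{\nu=0}^{l-1} \max\left( 0,\; \theta - \frac{\nu}{r_p} \right) - l\delta. \]

We may suppose that $\theta r_p \ge 10$, since otherwise

\[ \theta < 10 r_p^{-1} < \delta < 2 \delta^{1/2}, \]

and the desired inequality for $\theta$ then holds. Under this supposition, $[\theta r_p]^2 > 2 \theta^2 r_p^2 / 3$. Hence if $\theta r_p < l$, we have

\[ \sum_{\nu=0}^{l-1} \max\left( 0,\; \theta - \frac{\nu}{r_p} \right) = r_p^{-1} \sum_{\nu=0}^{[\theta r_p]} (\theta r_p - \nu) \ge \tfrac12 r_p^{-1} [\theta r_p]^2 \ge \tfrac13 \theta^2 r_p, \]

while if $\theta r_p \ge l$, then

\[ \sum_{\nu=0}^{l-1} \max\left( 0,\; \theta - \frac{\nu}{r_p} \right) = \sum_{\nu=0}^{l-1} \left( \theta - \frac{\nu}{r_p} \right) \ge \tfrac12 l \theta. \]

Hence

\[ \operatorname{index} F \ge \min\left( \tfrac12 l \theta,\; \tfrac13 r_p \theta^2 \right) - l \delta. \tag{25} \]

Combining (24) and (25), we obtain

\[ \min\left( \tfrac12 l \theta,\; \tfrac13 r_p \theta^2 \right) \le l (\Phi + \delta). \]

Thus either $\theta \le 2(\Phi + \delta)$, in which case $\theta$ satisfies the desired inequality, or

\[ \tfrac13 r_p \theta^2 \le l (\Phi + \delta) \le (r_p + 1)(\Phi + \delta). \]

Since $r_p + 1 < 4 r_p / 3$ by (19), this gives

\[ \theta < 2 (\Phi + \delta)^{1/2} \le 2 (\Phi^{1/2} + \delta^{1/2}), \]

and the proof is complete.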

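Theorem 4-12. Let $m$ be a positive integer, and suppose that

\[ 0 < \delta < \frac{1}{m\, 2^{m} (N+1)^2}. \tag{26} \]

Let $r_1, \ldots, r_m$ be positive integers such that

\[ r_m > 10 \delta^{-1}, \qquad \frac{r_{j-1}}{r_j} > \delta^{-1} \quad \text{for } j = 2, \ldots, m. \tag{27} \]

Let $q_1, \ldots, q_m$ be positive integers such that

\[ \log q_1 > 2 \delta^{-1} m (2m+1), \tag{28} \]

\[ r_j \log q_j \ge r_1 \log q_1, \quad \text{for } j = 2, \ldots, m, \tag{29} \]

\[ \log q_1 > 3 \delta^{-1} N(N+1). \tag{30} \]

Then

\[ \Theta_m(q_1^{\delta r_1}; q_1, \ldots, q_m; r_1, \ldots, r_m) < 10^{m} \delta^{(1/2)^{m}}. \tag{31} \]

Proof: The proof is by induction on $m$. For $m = 1$, we apply Theorem 4-10, together with the inequalities (30) and (26), and obtain

\[ \Theta_1(q_1^{\delta r_1}; q_1; r_1) \le \frac{3N(N+1)}{\log q_1} + \frac{N \log q_1^{\delta r_1}}{r_1 \log q_1} < (N+1)\delta < 10 \delta^{1/2}, \]

which is the desired inequality.

Now suppose that $p \ge 2$ is an integer, and that the theorem holds when $m = p - 1$. When $m = p$, the hypotheses of the present theorem are more stringent than those of Theorem 4-11, so that the latter is applicable here. We must estimate $M$ and $\Phi$. We have

\[ M = (r_1+1)^{2pl}\, 2^{2 r_1 p l}\, (l!)^2 B^{2l} \le (r_1+1)^{2pl}\, 2^{2 r_1 p l}\, l^{2l}\, q_1^{2 \delta r_1 l}. \]

Since $l \le r_p + 1 \le r_1 + 1 \le 2^{r_1}$, it follows that

\[ M \le 2^{(4p+2) r_1 l}\, q_1^{2 \delta r_1 l}. \]

By (28) with $m = p$, we have $4p + 2 < \delta p^{-1} \log q_1$, so that

\[ M < q_1^{\delta_1 l r_1}, \]

where

\[ \delta_1 = 2\delta (1 + p^{-1}). \tag{32} \]

Thus

\[ \Theta_1(M; q_p; l r_p) \le \Theta_1(q_1^{\delta_1 l r_1}; q_p; l r_p) \tag{33} \]

and

\[ \Theta_{p-1}(M; q_1, \ldots, q_{p-1}; l r_1, \ldots, l r_{p-1}) \le \Theta_{p-1}(q_1^{\delta_1 l r_1}; q_1, \ldots, q_{p-1}; l r_1, \ldots, l r_{p-1}). \tag{34} \]

Moreover, (32), together with the inequality (26) with $m = p$, implies that

\[ \delta_1 < \frac{1 + p^{-1}}{p\, 2^{p-1} (N+1)^2} < \frac{1}{(p-1)\, 2^{p-1} (N+1)^2}. \tag{35} \]

In particular, $(N+1)\delta_1 < \delta_1^{1/2}$.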

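It follows from (30), and the fact that $q_p \ge q_1$, that

\[ \log q_p > 3 \delta^{-1} N(N+1). \]

Hence by Theorem 4-10, the right side of (33) does not exceed

\[ \delta + \frac{N \delta_1 l r_1 \log q_1}{l r_p \log q_p} \le \delta + N \delta_1 < (N+1)\delta_1 < \delta_1^{1/2}; \]

here we have used (29).

To estimate the right side of (34), we use the induction hypothesis, that the theorem holds when $m = p - 1$. The conditions of the theorem are satisfied for $m = p - 1$, if we replace $\delta$ by $\delta_1$ and $r_1, \ldots, r_{p-1}$ by $l r_1, \ldots, l r_{p-1}$; since $\delta_1 > \delta$, this is obvious for all the relations but (26), which has already been verified in (35). It follows that

\[ \Theta_{p-1}(q_1^{\delta_1 l r_1}; q_1, \ldots, q_{p-1}; l r_1, \ldots, l r_{p-1}) < 10^{p-1} \delta_1^{(1/2)^{p-1}}. \]

Hence, since $\delta_1 < 4\delta$, the two results just proved imply that

\[ \Phi < 2 \delta^{1/2} + 2 \cdot 10^{p-1} \delta^{(1/2)^{p-1}} \le 3 \cdot 10^{p-1} \delta^{(1/2)^{p-1}}. \]

Finally, (20) gives

\[ \Theta_p(q_1^{\delta r_1}; q_1, \ldots, q_p; r_1, \ldots, r_p) \le 2 \left\{ 3 \bigl( 10^{p-1} \delta^{(1/2)^{p-1}} \bigr) + \bigl( 3 \cdot 10^{p-1} \delta^{(1/2)^{p-1}} \bigr)^{1/2} + \delta^{1/2} \right\} \]

\[ \le 2 \left( 3 \cdot 10^{p-1} + 3^{1/2} \cdot 10^{(p-1)/2} + 1 \right) \delta^{(1/2)^{p}} < 10^{p} \delta^{(1/2)^{p}}. \]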


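4-5 A combinatorial lemma

Theorem 4-13. If $r_1, \ldots, r_m$ are any positive integers, and $\lambda > 0$, then the number $A_m(\lambda)$ of sets of integers $j_1, \ldots, j_m$ which satisfy the inequalities

\[ 0 \le j_1 \le r_1, \; \ldots, \; 0 \le j_m \le r_m, \qquad \frac{j_1}{r_1} + \cdots + \frac{j_m}{r_m} \le \tfrac12 (m - \lambda) \]

does not exceed

\[ 2 m^{1/2} \lambda^{-1} (r_1 + 1) \cdots (r_m + 1). \]

Proof: We proceed by induction on $m$. The theorem holds for $m = 1$, since the number of integers $j_1$ such that

\[ 0 \le j_1 \le r_1, \qquad \frac{j_1}{r_1} \le \tfrac12 (1 - \lambda) \]

is at most $r_1 + 1$, and is 0 if $\lambda > 1$.

Now suppose $m > 1$. The result is trivial if $\lambda \le 2 m^{1/2}$, since then the conditions on the individual $j$'s give an improvement of the desired upper bound. Hence we may suppose that $\lambda > 2 m^{1/2}$. If we fix $j_m$, we must count the sets of integers $j_1, \ldots, j_{m-1}$ such that

\[ 0 \le j_1 \le r_1, \; \ldots, \; 0 \le j_{m-1} \le r_{m-1}, \qquad \frac{j_1}{r_1} + \cdots + \frac{j_{m-1}}{r_{m-1}} \le \tfrac12 (m - \lambda) - \frac{j_m}{r_m}. \]

Putting

\[ \tfrac12 (m - \lambda) - \frac{j_m}{r_m} = \tfrac12 \bigl( (m-1) - \lambda' \bigr), \]

or, what is the same thing,

\[ \lambda' = \lambda'(j_m) = \lambda - 1 + \frac{2 j_m}{r_m}, \]

we see that

\[ A_m(\lambda) = \sum_{j_m=0}^{r_m} A_{m-1}\bigl( \lambda'(j_m) \bigr). \]

By the induction hypothesis,

\[ A_m(\lambda) \le 2 (m-1)^{1/2} (r_1 + 1) \cdots (r_{m-1} + 1) \sum_{j=0}^{r_m} \left( \lambda - 1 + \frac{2j}{r_m} \right)^{-1}, \]

and it suffices to prove that

\[ \sum_{j=0}^{r} \left( \lambda - 1 + \frac{2j}{r} \right)^{-1} \le \lambda^{-1} (m-1)^{-1/2} m^{1/2} (r + 1) \]

for all positive integers $r$ and $m$, if $\lambda > 2 m^{1/2}$.

If $r$ is even, we put $j = \tfrac12 r + k$ and obtain the sum

\[ \sum_{k=-r/2}^{r/2} \left( \lambda + \frac{2k}{r} \right)^{-1} = \lambda^{-1} + \sum_{k=1}^{r/2} \left\{ \left( \lambda + \frac{2k}{r} \right)^{-1} + \left( \lambda - \frac{2k}{r} \right)^{-1} \right\} = \lambda^{-1} + \sum_{k=1}^{r/2} 2\lambda \left( \lambda^2 - \frac{4k^2}{r^2} \right)^{-1} \]

\[ \le \lambda^{-1} + 2\lambda \sum_{k=1}^{r/2} (\lambda^2 - 1)^{-1} = \lambda^{-1} + 2\lambda^{-1} \sum_{k=1}^{r/2} (1 - \lambda^{-2})^{-1} \le \lambda^{-1} (r + 1)(1 - \lambda^{-2})^{-1}. \]

Since $1 - \lambda^{-2} > 1 - m^{-1}/4 > (1 - m^{-1})^{1/2}$, we have the desired inequality.

If $r$ is odd, we put $j = (r-1)/2 + k$ and obtain the sum

\[ \sum_{k=-(r-1)/2}^{(r+1)/2} \left( \lambda + \frac{2k-1}{r} \right)^{-1} = \sum_{k=1}^{(r+1)/2} \left\{ \left( \lambda + \frac{2k-1}{r} \right)^{-1} + \left( \lambda - \frac{2k-1}{r} \right)^{-1} \right\} \]

\[ = 2\lambda \sum_{k=1}^{(r+1)/2} \left( \lambda^2 - \frac{(2k-1)^2}{r^2} \right)^{-1} \le \lambda (\lambda^2 - 1)^{-1} (r + 1), \]

and the result is as before.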
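For small parameters the count $A_m(\lambda)$ can be found by brute force and compared with the bound of Theorem 4-13. The sketch below is an illustration only; `count_sets` and `lemma_bound` are hypothetical names, and exact fractions are used for the comparison so that boundary cases are not lost to rounding.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def count_sets(rs, lam):
    """Number of (j_1, ..., j_m) with 0 <= j_i <= r_i and
    j_1/r_1 + ... + j_m/r_m <= (m - lam) / 2."""
    m = len(rs)
    limit = (m - Fraction(lam)) / 2
    return sum(
        1
        for js in product(*(range(r + 1) for r in rs))
        if sum(Fraction(j, r) for j, r in zip(js, rs)) <= limit
    )


def lemma_bound(rs, lam):
    """The bound 2 * sqrt(m) * (r_1 + 1) ... (r_m + 1) / lam."""
    total = 1
    for r in rs:
        total *= r + 1
    return 2 * sqrt(len(rs)) * total / lam


rs, lam = (4, 3, 5), 2
print(count_sets(rs, lam), lemma_bound(rs, lam))
```

The bound is crude for such small values; what matters in the sequel is that it decreases like $\lambda^{-1}$ while the total number of index sets is $(r_1+1) \cdots (r_m+1)$.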


4-6 The approximation polynomial. Let $\alpha$ be an algebraic integer of degree $n \ge 2$ over $K$, so that $\alpha$ is a zero of a polynomial which has integral coefficients in $K$ and which cannot be factored into a product of such polynomials of positive degrees. Let $L = K(\alpha)$ be the field obtained by adjoining $\alpha$ to $K$. Finally, let $\omega_1, \ldots, \omega_N$ be an integral basis for $K$, and put

\[ \overline{|\alpha|} = b_1, \qquad \max\bigl( \overline{|\omega_1|}, \ldots, \overline{|\omega_N|} \bigr) = b_2. \]

In the remainder of the proof we shall be concerned with a single set of values of $m$, $\delta$, $q_1$, $\zeta_1, \ldots, q_m$, $\zeta_m$, $r_1, \ldots, r_m$, which will be chosen later in the order just specified. The choice will be made so
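as to satisfy the following conditions:

\[ 0 < \delta < m^{-1}\, 2^{-m} (N+1)^{-2}, \tag{37} \]

\[ 10^{m} \delta^{(1/2)^{m}} + 2(1 + 3\delta)\, n m^{1/2} < \frac{m}{2}, \tag{38} \]

\[ r_m > 10 \delta^{-1}, \qquad \frac{r_{j-1}}{r_j} > \delta^{-1} \quad \text{for } j = 2, \ldots, m, \tag{39} \]

\[ \delta^2 \log q_1 > 2m + 1 + \log (b_1 + 1) + \tag{40} \]

\[ r_j \log q_j \ge r_1 \log q_1, \quad \text{for } j = 2, \ldots, m, \tag{41} \]

\[ \log q_1 > 3 \delta^{-1} N(N+1). \tag{42} \]

Notice that these conditions imply those of Theorem 4-12, since (37) and (40) together imply that $\delta \log q_1 > 2m(2m+1)$.

Define $\lambda$, $\mu$, $\eta$, $B_1$ by the equations

\[ \lambda = 4(1 + 3\delta)\, n m^{1/2}, \tag{43} \]

\[ \mu = \tfrac12 (m - \lambda), \tag{44} \]

\[ \eta = 10^{m} \delta^{(1/2)^{m}}, \tag{45} \]

\[ B_1 = [q_1^{\delta r_1}]. \tag{46} \]

Then (38) is equivalent to

\[ \eta < \mu. \]

Also,

\[ q_1^{\delta r_1 / 2} < B_1 \le q_1^{\delta r_1}, \tag{47} \]

since $\sqrt{x} < x - 1 < [x]$ for all $x > (3 + \sqrt{5})/2$, and

\[ q_1^{\delta r_1} > e^{(2m+1) r_1 \delta^{-1}} > e^{30}. \]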

We come now to the main lemma, which will be the only one to 
which reference is made in the eventual proof of the Thue-Siegel-Roth 
theorem. 


Theorem 4-14. Suppose that the conditions (37) through (42) are satisfied, and suppose that $\zeta_1, \ldots, \zeta_m$ are algebraic numbers of heights $q_1, \ldots, q_m$ respectively. Then there exists a polynomial $Q(z_1, \ldots, z_m)$ with integral coefficients in $K$ and of degree at most $r_j$ in $z_j$, for $j = 1, \ldots, m$, such that

(a) the index of $Q$ at the point $(\alpha, \ldots, \alpha)$ relative to $r_1, \ldots, r_m$ is at least $\mu - \eta$;

(b) $Q(\zeta_1, \ldots, \zeta_m) \ne 0$;

(c) for all derivatives

\[ Q_{i_1 \cdots i_m}(z_1, \ldots, z_m) = \frac{1}{i_1! \cdots i_m!} \left( \frac{\partial}{\partial z_1} \right)^{i_1} \cdots \left( \frac{\partial}{\partial z_m} \right)^{i_m} Q(z_1, \ldots, z_m), \]

where $i_1, \ldots, i_m$ are non-negative integers, the inequality

\[ |Q_{i_1 \cdots i_m}(z_1, \ldots, z_m)| < B_1^{1 + 3\delta} (1 + |z_1|)^{r_1} \cdots (1 + |z_m|)^{r_m} \]

holds, and the corresponding inequality also holds if the coefficients in $Q$ are replaced by their respective field conjugates.

Proof: Let $c_1, \ldots, c_N$ range independently over the non-negative rational integers not exceeding $B_1$, and let $C$ be the set of integers of $K$ of the form

\[ c_1 \omega_1 + \cdots + c_N \omega_N. \]

The number of elements of $C$ is $(1 + B_1)^{N}$, and if we put

\[ (1 + r_1) \cdots (1 + r_m) = r, \]

there are

\[ (1 + B_1)^{N r} \tag{48} \]

distinct polynomials

\[ P(z_1, \ldots, z_m) = \sum_{s_1=0}^{r_1} \cdots \sum_{s_m=0}^{r_m} \gamma(s_1, \ldots, s_m)\, z_1^{s_1} \cdots z_m^{s_m} \]

whose coefficients $\gamma(s_1, \ldots, s_m)$ belong to $C$. For $\gamma(s_1, \ldots, s_m)$ in $C$,

\[ \overline{|\gamma(s_1, \ldots, s_m)|} \le b_2 B_1 N, \]



and if we put 



then 




|Py,...yJ < 2’'>+-+’'’"625,iV < b2N2"'^'Bi < b2NBi^+‘, 

since mri log 2 < log qi by (40). Now replace all of 2i, . . . , Zm 
by a. Since the total number of terms is at most r, and since, by (40), 

r = (n + 1) • • • (r„ + 1) < 2’'‘+-+’'’" < {bi + I)””'* < Bi‘, (50) 


we obtain the bound 

< biNBi^-*-^^. 


Let d be a primitive element of L, so that L = «(«?). Order the 
conjugates of d so that t?i, • • ■ , are real and and ’^n+OTri-y 
complex-conjugate for v = 1, . . . , P 2 ) so that pi -f- 2 p 2 — n • 
be a fixed one of the numbers , a), where ji, ... ,3m 

satisfy the inequalities 


0 <ii < ru 


0 ^ jm 


h + ... + i^<p. ( 51 ) 

ri Tm 


Then this number can be written as a polynomial in the primitive element of $L$ chosen above, with rational coefficients, and as such has $nN$ field conjugates. Hence we can

define $nN$ real numbers by the equations




s 




tor V = 1 , . . . , Pi , 

for Pi + 1 < c < />! + pv 


('ollecting them in a fixed order for fixed coefficients • • • » 

and for all ju . . . Jm satisfying the ineiiualities (fil ), we have a set 
of numbers which can lie considered as coordinate^ of a point; by 
Theorem 4-13 there are 

M < 2nXwh-h' 

coordinates, and each is numerically smaller than [h-^X + l = t. 
Thus all the points, for the various sets of coefficients in (\ lie in a 
cube of edge 2t in ^/-dimensional space. If each edge is dividetl into 
3^ eciual parts, we get (30 '^ subcubes of edge 3. By (48), if 

(I + > (3/.)-^', (52) 

there are more points than subcubes, and the points corresponding 
to two different polynomials /'^*(2i, . . . , 2;„) and , z,n) he 

in the same subcube. If we put 


P{Z\, • . • . 2m) = , Z,n) - , 2,„), 


then 


_ 9 


• • • ,«)i < V2 ■ = < 1 


for ji, . . , ,jm as in (51). Since Pjy..j,^{a, , . . , a) is an algebraic 
integer whose norm is numerically smaller than 1, it must be zero. 
Hence the index of P at the point (a, . . . , a) relative to , r„, 

is at least g. Also the coefficients 7(si, . . . , s,„) in P are integers of 
K, not all zero, such that the relation (49) holds. 

To verify (52), notice that by the inequality (40), 

91*’'* > ib2N, 

and hence 

J5i > 41»2^) 

> (4!>2iVB,)Uv^ 

/i, '’" > + 3)hv-'(i+35)-‘, 

(1 + > (30 
We now apply Theorem 4-12, the hypotheses of which are satisfied, as was noted earlier. Since $P$ belongs to the class $\mathfrak{R}_m$ with the parameters $r_1, \ldots, r_m$ considered there, its index at $(\zeta_1, \ldots, \zeta_m)$ relative to $r_1, \ldots, r_m$ is less than $\eta$, defined in (45). Hence $P$ possesses some derivative

QiZu \ . . . (— 

^1^- ■ ' • kml \dZi/ \dZm/ 

with 


such that 




• • • , fm) 7*^ 0. 


The index of Q at the point relative to ri, . . . , is at 

least tx — r). Thus Q has the properties (a) and (b) of Theorem 4-14. 
From the relations (49) and (50), 

Hence for an arbitrary derivative, 

Finally, 


I "* 

|Q.l---m(Zl, • • ■ , Zm)\ < b2NBi^+^^ n (1 + \zy\ + • • • + IZvl’’”) 




m 


< b2NBi^+^‘n (1 + KW’ 


< n (1 + 

,' = 1 

since b 2 N < Bi^ by (40). The same inequality holds for the con- 
jugate polynomials, and the proof is complete. 

4-7 The Thue-Siegel-Roth theorem

Theorem 4-15. Let $K$ be an algebraic number field of degree $N$, and let $\alpha$ be algebraic. Then for each $\kappa > 2$, the inequality

\[ |\alpha - \zeta| < \frac{1}{(H(\zeta))^{\kappa}} \tag{53} \]

has only finitely many solutions $\zeta$ in $K$.

Proof: We shall suppose that the theorem is false, so that (53) has infinitely many solutions, and produce a contradiction. We may suppose also that $\alpha$ is an integer. For if not there is a positive rational integer $a$ such that $a\alpha$ is an algebraic integer, and for each solution $\zeta$ of (53) we have

a 

^ {HinY - {H(anY ' 


Hence for arbitrary $\epsilon > 0$, and for all solutions $\zeta$ with $H(\zeta)$ sufficiently large,

\[ |a\alpha - a\zeta| < \frac{1}{(H(a\zeta))^{\kappa - \epsilon}}, \]

and $\epsilon$ can be chosen so small that $\kappa - \epsilon > 2$.

Finally, it suffices to prove that (53) has only finitely many solutions in primitive elements $\zeta$ of $K$. For an algebraic number field has only finitely many subfields, and every element of $K$ is a primitive element of some one of its subfields; moreover, the inequality in question does not depend on the degree of $\alpha$ over $K$.

We first choose $m$ so large that $m > 4 n m^{1/2}$ and

\[ \frac{2m}{m - 4 n m^{1/2}} < \kappa, \tag{54} \]

which is possible since $\kappa > 2$. For sufficiently small $\delta$ we have

\[ m - 4(1 + 3\delta)\, n m^{1/2} - 2\eta > 0, \]

where $\eta$, given by (45), becomes arbitrarily small with $\delta$. This condition is the same as that of (38). We choose $\delta$ to satisfy this and the inequality (37), and finally the inequality

\[ \frac{2m(1 + \delta) + 2\delta N(2 + 5\delta)}{m - 4(1 + 3\delta)\, n m^{1/2} - 2\eta} < \kappa, \tag{55} \]

which is possible in view of (54). The inequality (55) is equivalent to

\[ \frac{m(1 + \delta) + \delta N(2 + 5\delta)}{\mu - \eta} < \kappa, \tag{56} \]

by equations (43) and (44).

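Having chosen $m$ and $\delta$, we now choose a solution $\zeta_1$ of (53) (a primitive element of $K$) with $H(\zeta_1) = q_1$ and with $q_1$ so large as to satisfy (40) and (42). We then choose further primitive solutions $\zeta_2, \ldots, \zeta_m$ of heights $q_2, \ldots, q_m$, such that for $j = 2, \ldots, m$,

\[ \frac{\log q_j}{\log q_{j-1}} > \frac{2}{\delta}. \tag{57} \]

We now take $r_1$ to be any integer such that

\[ r_1 > \frac{10 \log q_m}{\delta \log q_1}, \tag{58} \]

and define $r_j$, for $j = 2, \ldots, m$, by

\[ \frac{r_1 \log q_1}{\log q_j} \le r_j < \frac{r_1 \log q_1}{\log q_j} + 1. \tag{59} \]

Then the inequality (41) is satisfied. Also,

\[ \frac{r_j \log q_j}{r_1 \log q_1} < 1 + \frac{\log q_j}{r_1 \log q_1} \le 1 + \frac{\log q_m}{r_1 \log q_1} < 1 + \frac{\delta}{10}, \tag{60} \]

by (58). The conditions (39) are satisfied, since

\[ r_m \ge \frac{r_1 \log q_1}{\log q_m} > 10 \delta^{-1} \]

and

\[ \frac{r_{j-1}}{r_j} > \frac{\log q_j}{\log q_{j-1}} \left( 1 + \frac{\delta}{10} \right)^{-1} > 2 \delta^{-1} \left( 1 + \frac{\delta}{10} \right)^{-1} > \delta^{-1}, \]

by (59), (60), and (57).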

We know from Theorem 4-14 that there exists a polynomial
$Q(z_1, \ldots, z_m)$, whose properties are listed in that theorem. Let
$\xi_1, \ldots, \xi_m$ in $K$ be zeros of irreducible polynomials of degree $N$ with
relatively prime coefficients in $Z$, the coefficients of $x^N$ being
$k_1, \ldots, k_m$, respectively. Then the number

^ • • • I 

is an element of K. If the field conjugates of f,- are f,', , for 

z = 1, . . . , m, then is a sum of products of powers of the f,” 
with integral coefficients from K, and in each such product a factor 
occurs to the power r» at most. In the proof of Theorem 2-21, 
it was shown that the product of $k_i$ and any set of distinct conjugates
of $\xi_i$ is an algebraic integer. For each $i$, the field conjugates of $\xi_i$
are distinct, because $\xi_i$ is a primitive element of $K$. It follows that

is an algebraic integer, and since it is also rational 
it is a rational integer, so that 

> 1. (61) 


On the other hand, we have 



ri 

z 

n=0 


Tm 


Z * 

im=0 


a)(fi - «)' 


(r 


m 


and, by part (a) of Theorem 4-14, the terms with 




all vanish. In all other terms we have 



• ■ • qm^) 


— X 


u . . . 


rmlr\\imlTm 




< • ■ • 91*’"'’’’") 


irn I 




since > qi by (41). Hence, using part (c) of Theorem 4-14, 

we have 

M < (ri,+ 1) • • • (rm + l)Bi'+"'(l + bi)’"^i9r^i 

and by using part (c) again, together with Theorem 4-2, we obtain 

Ifcl’’* • • • fcm’’’"N^9l < 5jl+5«5j-n(M-’))*5^(W-l)(l+35) 

m ( N 1 rj 

X n Ui n (1 + 

t=i [ J=i J 


m 


Now, by (50), 


^^a^Ci+55 )^ Yl 

t =1 


0jV(nH H-m) ^ <; 


so that 




< qi 


6Nni2 +63) -f mn(l +«) -nij* -v)ft 
This, together with (61), implies that

$$\delta N(2 + 5\delta) + m(1 + \delta) > (\mu - \eta)\kappa,$$

or

$$\kappa < \frac{m(1 + \delta) + \delta N(2 + 5\delta)}{\mu - \eta},$$

which contradicts (56). This completes the proof.

4-8 Applications to Diophantine equations. The Thue-Siegel-Roth
theorem will now be applied to show that a rather large variety of
Diophantine equations have only finitely many solutions.

Theorem 4-16. Let $U(x, y)$ be a binary form of degree $n$, without
multiple linear factors, whose coefficients belong to an algebraic
number field $K_0$ of degree $h$. Let $x$ and $y$ be integral variables of $K_0$.
Suppose that

$$n > 2h.$$

Let $V(x, y)$ be any polynomial of total degree $\nu < n - 2h$ which has
coefficients in $K_0$ and has no common factor with $U(x, y)$. Then the
equation

$$U(x, y) = V(x, y) \qquad (62)$$

has only finitely many solutions.


Proof: Just as in the representation theory for binary quadratic
forms, it makes no difference whether we consider (62) or an equation
obtained from it by a substitution $x = ax' + by'$, $y = cx' + dy'$,
where $a$, $b$, $c$, $d$ are in $Z$, and $|ad - bc| = 1$. If
$U(x, y) = a_0 x^n + \cdots + a_n y^n$, then

$$U(x, ax + y) = U(1, a) x^n + \cdots + a_n y^n,$$

$$U(x + by, y) = a_0 x^n + \cdots + U(b, 1) y^n.$$

Choose $a$ in $K_0$ so that $U(1, a) \ne 0$, and put $U(x, ax + y) = U_1(x, y)$.
Then choose $b$ in $K_0$ so that $U_1(b, 1) \ne 0$, and put
$U_1(x + by, y) = U_2(x, y)$. Dropping the subscript, we see that
there is no loss in generality in supposing that the coefficients of
$x^n$ and $y^n$ in $U(x, y)$ are different from zero, and we can write

$$U(x, y) = a y^n \prod_{k=1}^{n} \left( \frac{x}{y} - \xi_k \right), \qquad (63)$$

where neither $a$ nor any $\xi_k$ is zero. By assumption, the numbers $\xi_k$
are distinct, so if we put

$$c_1 = \min_{j \ne k} |\xi_j - \xi_k|,$$

then $c_1 > 0$, and for every $x$ and $y$, at least $n - 1$ of the factors in the
product occurring in (63) have absolute values not less than $\tfrac{1}{2} c_1$.

Let $x = \eta$ and $y = \zeta \ne 0$ be integers of $K_0$, with field conjugates
$\eta^{(1)}, \ldots, \eta^{(h)}$, $\zeta^{(1)}, \ldots, \zeta^{(h)}$. Then as we saw in Theorem 2-5,

$$\prod_{j=1}^{h} (\zeta^{(j)} t - \eta^{(j)}) = (Q(t))^{l},$$

where $Q(t)$ is an irreducible polynomial with coefficients in $Z$, and
$1 \le l \le h$. Let $M = \max_j (|\eta^{(j)}|, |\zeta^{(j)}|)$, and name the conjugates so that

$$Q(t) = \prod_{j=1}^{h/l} (\zeta^{(j)} t - \eta^{(j)}).$$

Then the coefficients of $Q(t)$ are numerically smaller than the
corresponding coefficients of

$$\prod_{j=1}^{h/l} (Mt + M),$$

so that $\|Q\| < (2M)^{h/l}$. A fortiori, $H(\eta/\zeta) < (2M)^{h}$.

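Now by Theorem 4-15, there are only finitely many solutions of
the inequality

$$\left| \xi_k - \frac{\eta}{\zeta} \right| < \frac{1}{(H(\eta/\zeta))^{2+\epsilon'}}$$

for fixed $\epsilon' > 0$. Hence for $M$ sufficiently large, and $\epsilon = \epsilon' h$,

$$\left| \xi_k - \frac{\eta}{\zeta} \right| \ge \frac{1}{(H(\eta/\zeta))^{2+\epsilon'}} > \frac{1}{(2M)^{2h+\epsilon}},$$

at least if the left side is not zero. This is certainly true of the
solutions of (62), since $U(x, y)$ and $V(x, y)$ have no common factor.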
The same argument applies to the numbers and for 

we see that for e > 0 and M sufficiently 

large, the inequality 


h — 


U) 


> 


1 


(2M) 


2h-\-t ^ 


j = l, .. . ,h-, A: = 1 


• « 




holds for every solution of (62). There is no loss in generality in
supposing that $M = |\zeta|$, since (62) remains correct after replacing
all quantities by their conjugates and if necessary interchanging
$x$ and $y$. Hence, for large $M$,

On the other hand, there is a constant $c_2$, depending only on the
coefficients of $V$, such that

$$|V(\eta, \zeta)| < c_2 M^{\nu}.$$

If we choose $\epsilon < n - 2h - \nu$, then for sufficiently large $M$,

$$|U(\eta, \zeta)| > |V(\eta, \zeta)|.$$

But a bound on $M$ implies a bound on the integral coefficients of the
polynomials defining $\eta$ and $\zeta$, so that there are only finitely many
solutions of (62).

Corollary. If $U(x, y)$ is a binary form of degree $n > 2$, with
coefficients in $Z$ and without repeated linear factors, and if $a \ne 0$ is a
rational integer, there are only finitely many rational integral solutions
of the equation $U(x, y) = a$. In particular, the equation

$$ax^n + by^n = c$$

has only finitely many solutions in $Z$ if $a$, $b$, and $c$ are in $Z$, $abc \ne 0$,
and $n \ge 3$.

This follows immediately from the theorem, with $K_0 = R$, $h = 1$,
and $n - 2h > 0$. The special case mentioned includes the higher-degree
analog of Pell's equation, $x^n - dy^n = N$.
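
As an illustrative aside, the finiteness asserted in the Corollary can be observed numerically for one instance. The sketch below searches a bounded box for integer solutions of $x^3 - 2y^3 = 1$. The function name `thue_solutions`, the chosen equation, and the search bound are assumptions made for illustration only, and a bounded search says nothing by itself about solutions outside the box.

```python
def thue_solutions(a, b, c, n, bound):
    """Return all integer pairs (x, y) with |x|, |y| <= bound
    satisfying a*x**n + b*y**n == c (plain exhaustive search)."""
    return [
        (x, y)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if a * x**n + b * y**n == c
    ]


# Illustrative instance: x^3 - 2*y^3 = 1, searched in the box |x|, |y| <= 200.
solutions = sorted(thue_solutions(1, -2, 1, 3, 200))
print(solutions)
```

In this box the search returns only the two small solutions $(-1, -1)$ and $(1, 0)$, in keeping with the finiteness statement.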

In the above considerations, strong use was made of the homogeneity 
of U{x,y). If a Diophantine equation is not of the form specified in 
Theorem 4-16, it may still be possible to relate its solvability to that 
of one of this form. We now consider such a case. 

4-9 A special equation. It was conjectured by E. Catalan in 1842
that 8 and 9 are the only two consecutive integers larger than 1 which
are powers of other integers. This has never been proved; it has not
even been shown that no three consecutive integers are powers,
although it is trivial that no four can be, since one must be of the
form $4k + 2$. In slightly different terms, the problem is to show that
the Diophantine equation

$$x^{w} - y^{z} = 1 \qquad (64)$$

has no solutions with $w$ and $z$ larger than 1, except for that
mentioned. Various special cases arise by fixing, or specializing in some
other way, one or more of the variables in (64). The case we are now
going to examine is that in which the exponents are fixed, so that we
consider the equation

$$x^{m} - y^{n} = 1. \qquad (65)$$

Catalan's conjecture would be proved if it could be shown that for
each pair of integers $m$ and $n$ larger than 1, (65) has no positive
solutions except that mentioned. Since this seems to be unfeasible, we
consider the more modest question of whether (65) can have infinitely
many solutions. This, at least, is a question that can be answered.
It is a very weak consequence of the following theorem, due to
Mahler, that (65) has only finitely many solutions if $m \ge 2$, $n \ge 3$.
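
The statement about consecutive powers is easy to probe by machine. The following sketch lists all perfect powers up to a bound and reports the pairs that differ by 1. The helper names `perfect_powers` and `consecutive_power_pairs` and the bound $10^6$ are assumptions chosen for illustration; the search is only a finite check, not an argument.

```python
def perfect_powers(limit):
    """Set of all integers a**k <= limit with a >= 2 and k >= 2."""
    powers = set()
    base = 2
    while base * base <= limit:
        value = base * base
        while value <= limit:
            powers.add(value)
            value *= base
        base += 1
    return powers


def consecutive_power_pairs(limit):
    """Sorted list of pairs (n, n + 1) with both entries perfect powers <= limit."""
    powers = perfect_powers(limit)
    return sorted((n, n + 1) for n in powers if n + 1 in powers)


pairs = consecutive_power_pairs(10**6)
print(pairs)
```

Up to the chosen bound the only pair reported is $(8, 9)$.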

Theorem 4-17. Suppose that $m \ge 2$, $n \ge 3$, $ab \ne 0$, $(x, y) = 1$.
Then as $\max (|x|, |y|) \to \infty$, the greatest prime factor of

$$ax^{m} + by^{n}$$

tends to infinity.

Since $x^2 - y^2 = 1$ has only the obvious solutions $x = \pm 1$, $y = 0$,
the new problem is completely solved. Unfortunately Mahler's
proof, which depends on a p-adic version of the Thue-Siegel theorem,
cannot be included here. We can, however, obtain partial results of
some interest.

If $mn$ is even, the fact that (65) has only finitely many solutions
is a consequence of the next theorem, which is a special case of a
theorem proved anonymously and published by L. J. Mordell.

Theorem 4-18. Let $f(x)$ be a polynomial of degree $n \ge 3$, with
coefficients in $Z$ and with distinct zeros, and let $a$ be any nonzero
rational integer. Then the equation

$$ay^2 = f(x) \qquad (66)$$

has only finitely many solutions $x$, $y$ in $Z$.

Proof: Suppose that

$$f(x) = a_0 (x - \xi_1) \cdots (x - \xi_n)$$

and that (66) has infinitely many solutions. The numbers $\alpha_j = a_0 \xi_j$,
for $j = 1, \ldots, n$, are algebraic integers, and if (66) holds, then

$$a a_0^{n-1} y^2 = (a_0 x - \alpha_1) \cdots (a_0 x - \alpha_n).$$

Let $K = R(\xi_1, \ldots, \xi_n)$ be the splitting field of $f$. Any ideal in $K$
dividing $[a_0 x - \alpha_i]$ and $[a_0 x - \alpha_j]$ also divides $[\alpha_i - \alpha_j]$, so that the
norm of such a common divisor is a divisor of the discriminant $d$ of $f$.
Hence, if $P$ is a prime ideal divisor of $y$ and $NP > d$, then for some $i$,
$P^2 \mid [a_0 x - \alpha_i]$. Since there are only finitely many ideals with norms
smaller than $d$, and only finitely many divisors of $a_0^{n-1} a$, it follows
that for each $i$,

$$[a_0 x - \alpha_i] = B_i C_i^2, \qquad (67)$$

where $B_i$ and $C_i$ are ideals, and $B_i$ runs over a finite set of ideals.

Let $D$ run over a fixed system of representatives of the various ideal
classes in $K$; the number of $D$'s is finite. Then for each $i$ and some
$D$, $C_i \sim D$, so that

$$[\beta] C_i = [\delta] D,$$

for some $\beta$ and $\delta$. We shall show that $\beta$ can be chosen from a finite set
of integers of $K$. Let $([\beta], [\delta]) = E$, and put $[\beta] = EF$, $[\delta] = EG$.
Then $EFC_i = EDG$, whence $FC_i = DG$; thus $F \mid D$, and $F$ is one of a
finite set of ideals. By Theorem 3-2, there is an $H$ with norm less
than $c$ (so that $H$ is one of a finite set) such that $FH = [\gamma]$ is principal.
Thus

$$[\gamma] C_i = (GH) D.$$

Since $C_i \sim D$, also $[\gamma] \sim GH$; hence $GH = [\zeta_i]$ is principal, and

$$[\gamma] C_i = [\zeta_i] D,$$

where $\gamma$ is one of a finite set of integers.

By (67), 

[y^][a(jX — a,] = , 

from which it follows that BiD^ is principal, say B.D* = [n.]- Thus 

* . 


for some unit €», 


yi^ia^x — oti) = . 

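By Dirichlet's theorem on units, $\epsilon_i$ can be written as $\epsilon_i' \epsilon_i''^{\,2}$, where
$\epsilon_i'$ is one of a finite number of units. Finally, for $i = 1, \ldots, n$,

$$a_0 x - \alpha_i = \kappa_i \lambda_i^2,$$

where $\lambda_1, \ldots, \lambda_n$ are integers of $K$, and $\kappa_1, \ldots, \kappa_n$ are certain ones
of finitely many numbers of $K$. Hence

$$\kappa_1 \lambda_1^2 - \kappa_2 \lambda_2^2 = \alpha_2 - \alpha_1 \ne 0,$$

$$\kappa_2 \lambda_2^2 - \kappa_3 \lambda_3^2 = \alpha_3 - \alpha_2 \ne 0,$$

$$\kappa_3 \lambda_3^2 - \kappa_1 \lambda_1^2 = \alpha_1 - \alpha_3 \ne 0.$$

Now let $L = K(\sqrt{\kappa_1}, \sqrt{\kappa_2}, \sqrt{\kappa_3})$. Then, in $L$,

$$(\lambda_1 \sqrt{\kappa_1} - \lambda_2 \sqrt{\kappa_2})(\lambda_1 \sqrt{\kappa_1} + \lambda_2 \sqrt{\kappa_2}) = \alpha_2 - \alpha_1,$$

and since the denominators of $\kappa_1$ and $\kappa_2$ can be taken to be bounded,
it follows that

$$\lambda_1 \sqrt{\kappa_1} - \lambda_2 \sqrt{\kappa_2} = \beta_3 \epsilon_3^{\,l},$$

where $\beta_3$ is one of finitely many elements of $L$, $\epsilon_3$ is a unit of $L$, and
$l > 1$ is an arbitrary positive integer. Similarly,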





> 


But then 




If there were only finitely many distinct ratios ci/es, there would be a 
finite set of coefficients <p such that 

y/aox — ot2 — x/aox — as = v>(\/aox — ^ — x/oox — ot2) 

for every solution $x$ of (66) and for suitable determination of the
radicals. This is clearly impossible, so (68) must have infinitely
many solutions in integers $\epsilon_1/\epsilon_3$, $\epsilon_2/\epsilon_3$ of $L$. But for $l$ sufficiently
large, this is in contradiction with Theorem 4-16. Hence the
supposition that (66) has infinitely many solutions is not tenable, and the
proof is complete.

Returning to equation (65), we see that the only possible solutions
have $x = 0$ or $\pm 1$, if $(m, n) > 1$. For the problem that remains, it
suffices to consider the case in which $m = p$ and $n = q$ are distinct
odd primes. This was treated by M. Newman, whose work was not
published. A slightly strengthened version of his result, obtained by
applying Theorem 4-16 rather than the analogous consequence of the
Thue-Siegel theorem, follows.


Theorem 4-19. If $p$ and $q$ are distinct odd primes such that
$q > 2(p - 1)$ and $q$ does not divide the class number of the cyclotomic
field $K_p = R(\zeta)$, where $\zeta = \exp (2\pi i/p)$, then the equations

$$x^{p} - y^{q} = \pm 1 \qquad (69)$$

have only finitely many solutions $x$, $y$ in $Z$.


Proof: We carry out the proof only for the equation $x^p - y^q = 1$;
the alternate case requires only trivial modifications. Put $1 - \zeta = \pi$
and $[\pi] = P$, so that $P$ is a prime ideal of $K_p$, by Theorem 3-6. Let $h$
be the class number of $K_p$.

If $x$ and $y$ satisfy (69) with the plus sign, then

$$[x - 1][x - \zeta] \cdots [x - \zeta^{p-1}] = [y]^q. \qquad (70)$$

Put

$$D_{rs} = [x - \zeta^r, x - \zeta^s] \quad \text{for } 0 \le r \le p - 1,\ 0 \le s \le p - 1,\ r \ne s.$$

Then

$$D_{rs} = [x - \zeta^r, \zeta^r - \zeta^s] = [x - 1 + 1 - \zeta^r, \pi] = [x - 1, \pi],$$

since $(\zeta^r - \zeta^s)/(1 - \zeta)$ is a unit if $r \ne s$. Thus $D_{rs}$ is the same
for all $r$ and $s$, and, since $D_{rs} \mid P$ and $P$ is prime, either $D_{rs} = [1]$
or $D_{rs} = P$. We consider the two cases separately.
If $D_{rs} = [1]$, then the ideals $[x - \zeta^r]$ are pairwise relatively prime;
since their product is a $q$th power, there are ideals $A_0, \ldots, A_{p-1}$
such that

$$[x - \zeta^r] = A_r^{q}, \quad r = 0, \ldots, p - 1. \qquad (71)$$

Suppose that $e_r$ is the smallest positive integer such that $A_r^{e_r}$ is
principal; by Theorem 3-4, $e_r \mid h$, and by (71), $e_r \mid q$. But $q$ is prime and
$q \nmid h$, so $e_r = 1$ and $A_r$ is principal. Hence there are integers $\alpha$ and $\beta$
and units $\epsilon$ and $\epsilon'$ of $K_p$ such that $x - 1 = \epsilon \alpha^q$ and $x - \zeta = \epsilon' \beta^q$, whence

$$\epsilon' \beta^q - \epsilon \alpha^q = \pi. \qquad (72)$$

By Theorem 2-45, the units of $K_p$ have a finite basis, so that each
unit has a representation $\epsilon = \epsilon_1 \epsilon_2^{\,q}$, where $\epsilon_1$ is one of the finite number
of units obtained by taking products of powers of the basis elements,
with exponents non-negative and smaller than $q$. Thus (72) implies
that one of the finitely many equations

$$\epsilon_1' (\epsilon_2' \beta)^q - \epsilon_1 (\epsilon_2 \alpha)^q = \pi \qquad (73)$$

must hold. But for each choice of $\epsilon_1$ and $\epsilon_1'$, (73) has only finitely
many integral solutions $\epsilon_2 \alpha$, $\epsilon_2' \beta$ in $K_p$; this is evident from Theorem
4-16 with $K_0 = K_p$, $h = p - 1$, $n = q > 2(p - 1)$, $\nu = 0$. Hence
$x$, and therefore also $y$, has only finitely many possible values.

The proof for the case Dr^ = P proceeds similarly. We put 
X — 1 = TTU) and y = where w and z are integers of Kp with 
[tt, z] = [1]. Then (70) becomes 

1^1 [in + • • • [in + 


and since the ideals on the left are pairwise relatively prime, there is 
a t with 0 ^ t ^ p ~ 1 such that 


pmq—p 




Thus there are ideals Aq, . . . , Ap^i such that 






= A?, for 0 < r < p — 





r 9^ t. 


As before, it follows that all the ideals $A_r$ are principal (for $r = t$, use
the fact that an ideal equivalent to a principal ideal is principal).
Since $p > 2$, there are distinct rational integers $r$ and $s$ different from
$t$ such that $0 \le r \le p - 1$, $0 \le s \le p - 1$. Then for integers $\alpha$ and
$\beta$ and units $\epsilon$ and $\epsilon'$ of $K_p$,

$$w + \frac{1 - \zeta^r}{1 - \zeta} = \epsilon \alpha^q, \qquad w + \frac{1 - \zeta^s}{1 - \zeta} = \epsilon' \beta^q,$$

so that

$$\epsilon' \beta^q - \epsilon \alpha^q = \frac{\zeta^r - \zeta^s}{1 - \zeta},$$

and the expression on the right is not zero. The earlier reasoning
shows that the theorem is also true in this case.

PROBLEMS 

1. Extend Theorem 4-18 to the case that $f$ may have multiple zeros, but
has at least three distinct zeros of odd orders.

2. Deduce from the finiteness of the number of solutions of (66) that as
the integral variable $x$ tends to infinity, the greatest prime divisor of $f(x)$
does also. [Hint: Assume that for infinitely many $x$, $f(x)$ is a product of
powers of a fixed finite set of primes, and obtain a contradiction.]

CHAPTER 5


IRRATIONALITY AND TRANSCENDENCE 

5-1 Irrational numbers. One of the oldest results in the theory 

of numbers is that $\sqrt{2}$ is irrational; this was known to the Pythagoreans
in the fifth century B.C. The proof, when suitably generalized
with the help of the Unique Factorization Theorem, leads to the well- 
known rule for determining the possible rational zeros of a polynomial 
with rational integral coefficients; this in turn makes it possible to 
show, if such is the case, that a given polynomial has only irrational 
zeros. Thus the numbers given implicitly as zeros of polynomials can 
be trivially classified as rational or irrational. 

If a number is given by its decimal expansion, one has only to 
determine whether its digits eventually recur periodically to know 
whether or not it is irrational. For example, the number 

0.1234567891011 . . . , 

whose successive digits are formed in an obvious fashion, is clearly 
irrational, since arbitrarily long blocks of a single digit occur, pre- 
cluding periodicity. Similarly, using the regular continued fraction 
expansion of a real number, one can identify not only the rational 
numbers but also the quadratic irrationalities. (Unfortunately,
there is no simple algorithm known which singles out the algebraic
numbers of fixed degree $n \ge 3$ in a distinctive way.)

If a real number $x$ is not given in one of these convenient forms, the
problem of deciding whether or not it is rational may be decidedly
nontrivial. It is, for example, not known whether Euler's constant,
defined as

$$\lim_{n \to \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n \right),$$

is rational. Aside from properties of special algorithms, the only
method available for investigating such questions depends on the
following observation. If $x = a/b$ is rational, then for every pair of
integers $p$ and $q$, the number $qx - p$ is some integral multiple of $1/b$,
so that it is impossible to find an infinite sequence of pairs $p_n$ and $q_n$
such that

$$|q_1 x - p_1| > |q_2 x - p_2| > |q_3 x - p_3| > \cdots. \qquad (1)$$

More generally, no such sequence can be found for which

$$|q_n x - p_n| \ne 0 \text{ for every } n, \quad \text{and} \quad \lim_{n \to \infty} |q_n x - p_n| = 0. \qquad (2)$$

On the other hand, when $x$ is irrational there are infinitely many
solutions of the inequality

$$0 < |qx - p| < \frac{1}{|q|}.$$

We therefore have

Theorem 5-1. Each of the following is a necessary and sufficient
condition for the irrationality of a real number $x$:

(a) there are integers $p_1, q_1, p_2, q_2, \ldots$ such that the inequalities (1)
hold;

(b) there are integers $p_1, q_1, p_2, q_2, \ldots$ such that conditions (2)
hold.

As a simple application of this principle, we prove

Theorem 5-2. The number $e$ is irrational.

Proof: We recall the expansion

$$\frac{1}{e} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}.$$

It is well known that if $a_0, a_1, \ldots$ is an unbounded increasing sequence
of positive numbers, then the series

$$\sum_{k=0}^{\infty} \frac{(-1)^k}{a_k} \qquad (3)$$

converges to its sum $S$ in such a way that

$$0 < \left| S - \sum_{k=0}^{n} \frac{(-1)^k}{a_k} \right| < \frac{1}{a_{n+1}}$$

for $n \ge 0$. Hence if we put $q_n = n!$ and

$$p_n = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!},$$

then $p_n$ and $q_n$ are integers and

$$0 < \left| \frac{q_n}{e} - p_n \right| = n! \left| \frac{1}{e} - \sum_{k=0}^{n} \frac{(-1)^k}{k!} \right| < \frac{n!}{(n + 1)!} = \frac{1}{n + 1}.$$

It follows that $1/e$, and hence $e$ itself, is irrational. (This is a variant
of the original proof due to Fourier.) More generally, the same
argument shows that if the lcm of the integers $a_1, \ldots, a_n$ is $o(a_{n+1})$
as $n \to \infty$, then the series (3) converges to an irrational number.
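
The inequality $0 < |q_n/e - p_n| < 1/(n + 1)$ can be checked in exact rational arithmetic. In the sketch below the names `partial_sum`, `INV_E_PROXY`, and `check` are assumptions introduced for illustration. Since $1/e$ itself is not a rational number, a much longer partial sum of the same series is used as a stand-in for it; by the alternating-series estimate quoted above, its error is below $1/61!$, which is negligible at the scale being tested.

```python
from fractions import Fraction
from math import factorial


def partial_sum(n):
    """Exact value of sum_{k=0}^{n} (-1)^k / k! as a Fraction."""
    return sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))


# Stand-in for 1/e (an assumption of this sketch): a far longer partial sum.
INV_E_PROXY = partial_sum(60)


def check(n):
    """True if p_n is an integer and 0 < |q_n/e - p_n| < 1/(n+1) (with the proxy)."""
    q = factorial(n)
    p = q * partial_sum(n)          # integer, since k! divides n! for k <= n
    gap = abs(q * INV_E_PROXY - p)
    return p.denominator == 1 and 0 < gap < Fraction(1, n + 1)


results = [check(n) for n in range(1, 21)]
print(all(results))
```

The pairs $(p_n, q_n)$ produced this way are exactly the sequence required by condition (2) of Theorem 5-1, applied to $x = 1/e$.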

For completeness, we give a proof due to I. Niven that $\pi$ is irrational.
It is short and simple to follow, but to one unfamiliar with
older work it must appear completely unmotivated.

Theorem 5-3. The number $\pi$ is irrational.

Proof: Suppose on the contrary that $\pi = a/b$, where $a$ and $b$ are
integers. Put

$$f(x) = \frac{x^n (a - bx)^n}{n!}$$

and

$$F(x) = f(x) - f''(x) + f^{(4)}(x) - \cdots + (-1)^n f^{(2n)}(x),$$

where the positive integer $n$ will be specified later. Now
$f(0) = f'(0) = \cdots = f^{(n-1)}(0) = 0$, and if we write

$$f(x) = \frac{a_0 x^n + a_1 x^{n+1} + \cdots + a_n x^{2n}}{n!},$$

we see that for $n \le k \le 2n$,

$$f^{(k)}(x) = \frac{1}{n!} \sum_{l=0}^{n} (n + l)(n + l - 1) \cdots (n + l - k + 1)\, a_l x^{n+l-k} = \frac{1}{n!} \sum_{l=k-n}^{n} \frac{(n + l)!}{(n + l - k)!}\, a_l x^{n+l-k},$$

so that

$$f^{(k)}(0) = \frac{k!}{n!}\, a_{k-n}.$$

Hence $f^{(k)}(0) \in Z$, and since $f(x) = f(\pi - x)$, also $f^{(k)}(\pi) \in Z$, for
$0 \le k \le 2n$. Finally, $F(0)$ and $F(\pi)$ must be integers.

On the other hand,

$$\frac{d}{dx} \{ F'(x) \sin x - F(x) \cos x \} = F''(x) \sin x + F(x) \sin x = f(x) \sin x,$$

so that

$$\int_0^{\pi} f(x) \sin x \, dx = [F'(x) \sin x - F(x) \cos x]_0^{\pi} = F(\pi) + F(0).$$

But for $0 < x < \pi$,

$$0 < f(x) \sin x < \frac{\pi^n a^n}{n!},$$

so that the above integral is positive but arbitrarily small for $n$
sufficiently large. But this is impossible, since $F(0) + F(\pi)$ is an
integer. The contradiction establishes the theorem.
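
The integrality step of the proof, namely that $f^{(k)}(0) = k!\, a_{k-n}/n!$ is an integer for every $k$, can be verified exactly for sample values. In the sketch below, the function name `derivative_at_zero` and the sample values $a = 22$, $b = 7$, $n = 5$ are assumptions chosen for illustration (they stand for a hypothetical representation $\pi = a/b$, which the theorem shows cannot exist).

```python
from fractions import Fraction
from math import comb, factorial


def derivative_at_zero(a, b, n, k):
    """Exact k-th derivative at 0 of f(x) = x^n (a - b x)^n / n!."""
    if k < n or k > 2 * n:
        return Fraction(0)          # no term of degree k in the numerator
    l = k - n
    # Coefficient of x^(n+l) in x^n (a - b x)^n, by the binomial theorem.
    coeff = comb(n, l) * a ** (n - l) * (-b) ** l
    return Fraction(factorial(k) * coeff, factorial(n))


# Illustrative sample values (assumptions): a = 22, b = 7, n = 5.
values = [derivative_at_zero(22, 7, 5, k) for k in range(0, 11)]
print([int(v) for v in values])
```

Every entry has denominator 1, as the proof requires; the entries with $k < n$ are zero.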

PROBLEM 

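Given a real number $x$, define the sequence $\{x_k\}$ of real numbers and the
sequence $\{a_k\}$ of integers by the conditions

$$[x] = a_0, \qquad x_1 = x - [x],$$

$$x_1 = \frac{1}{a_1} + x_2, \quad \text{where } \frac{1}{a_1} \le x_1 < \frac{1}{a_1 - 1},$$

$$x_2 = \frac{1}{a_2} + x_3, \quad \text{where } \frac{1}{a_2} \le x_2 < \frac{1}{a_2 - 1},$$

$$\cdots$$

$$x_k = \frac{1}{a_k} + x_{k+1}, \quad \text{where } \frac{1}{a_k} \le x_k < \frac{1}{a_k - 1}.$$

Thus

$$x = a_0 + \frac{1}{a_1} + \frac{1}{a_2} + \cdots.$$

Show that this expansion terminates if and only if $x$ is rational. Show also
that if $x$ has an infinite series expansion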


Oi £>2 

where the numbers 6, are integers with b,+i > then 6* 
and X is irrational. 


Ck for all k, 
5-2 The existence of transcendental numbers. One class of
irrationals, the algebraic numbers, has been treated in some detail in
the preceding chapters. We now consider the complementary set of
transcendental numbers: those complex numbers which do not satisfy
any rational algebraic equation with coefficients in $Z$. It is by no
means obvious that this set is nonvacuous; the first proof, given by
Liouville in 1844, depends on the fact (see Theorem 4-1) that if $\alpha$
is algebraic of degree $n \ge 2$, then there is a constant $C$ such that
the inequality

$$\left| \alpha - \frac{p}{q} \right| < \frac{C}{q^n}$$

has no solution $p$, $q$ in $Z$. If a number $\xi$ can be found such that for
every $\omega > 0$ the inequality

$$0 < |q\xi - p| < \frac{1}{q^{\omega}}, \qquad q > 1, \qquad (4)$$

has a solution, then $\xi$ cannot be algebraic of any degree, and must
therefore be transcendental.

An example of a Liouville number, for which (4) always has a
solution, is given by

$$\xi = \sum_{k=1}^{\infty} (-1)^k a^{-b_k},$$

where $a > 1$ is a fixed integer and $b_1, b_2, \ldots$ is an increasing sequence
of positive integers such that

$$\limsup_{k \to \infty} \frac{b_{k+1}}{b_k} = \infty.$$

For, given $\omega$, there is an $n = n(\omega)$ for which $b_{n+1}/b_n > \omega + 1$, and if
we put

$$q = a^{b_n}, \qquad p = q \sum_{k=1}^{n} (-1)^k a^{-b_k},$$

then $p$ and $q$ are integers, and

$$0 < |q\xi - p| < q\, a^{-b_{n+1}} < \frac{1}{q^{\omega}}.$$
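
A concrete instance makes the estimate tangible. The sketch below takes $a = 10$ and $b_k = k!$ (both choices are assumptions made for illustration, as are the names `partial` and `liouville_gap`), so that $b_{n+1}/b_n = n + 1$ and the bound $q\,a^{-b_{n+1}}$ equals $q^{-n}$. Because $\xi$ is irrational, a partial sum with a few extra terms is used as an exact rational stand-in for it; the neglected tail is far smaller than the quantities compared.

```python
from fractions import Fraction
from math import factorial

A = 10  # the base a > 1 (illustrative choice)


def b(k):
    """Exponents b_k = k! (illustrative choice with b_{k+1}/b_k unbounded)."""
    return factorial(k)


def partial(n):
    """Exact value of sum_{k=1}^{n} (-1)^k * A^(-b_k)."""
    return sum(Fraction((-1) ** k, A ** b(k)) for k in range(1, n + 1))


def liouville_gap(n, extra=3):
    """Return (q, p, |q*xi - p|), with xi replaced by a longer partial sum."""
    q = A ** b(n)
    p = q * partial(n)
    xi_proxy = partial(n + extra)
    return q, p, abs(q * xi_proxy - p)


checks = []
for n in range(1, 5):
    q, p, gap = liouville_gap(n)
    # p must be an integer, and the gap must lie strictly between 0 and q^(-n).
    checks.append(p.denominator == 1 and 0 < gap < Fraction(1, q ** n))
print(checks)
```

For each $n$ tested, the approximation error is below $q^{-n}$, so (4) is solvable for every $\omega < n$.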

It should be emphasized that the condition (4), while sufficient
for transcendence, is by no means necessary, even for real numbers.
For, using a modification of an argument due to Cantor, we can give
a second proof of the existence of transcendental numbers, and in
particular of numbers of this kind for which the inequality

$$\left| \xi - \frac{p}{q} \right| < \frac{1}{q^{2+\epsilon}}$$

has only finitely many solutions for fixed $\epsilon > 0$. It is known* that
there are uncountably many irrational numbers $\xi$ for which $M(\xi) = 3$,
where $M(\xi)$ is the supremum of the numbers $\lambda$ for which the inequality

$$\left| \xi - \frac{p}{q} \right| < \frac{1}{\lambda q^2}$$

has infinitely many solutions. Hence if the algebraic numbers are
countable, it follows that there are nonalgebraic numbers for which
$M(\xi) = 3$.

To order the algebraic numbers, we associate with each nonconstant
polynomial $P(x) = a_0 x^n + \cdots + a_n$ with integral coefficients
the number $h(P) = n + |a_0| + \cdots + |a_n|$. There are no
polynomials with $h(P) = 1$. If $h(P) = 2$, then $P(x) = x$ or $-x$. If
$h(P) = 3$, then $P(x)$ is one of $\pm x \pm 1$, $\pm 2x$, $\pm x^2$, all combinations
of signs being allowed. In general, it is clear that for each $k$ there
are only finitely many polynomials such that $h(P) = k$. Hence all
polynomials with integral coefficients can be arranged in a sequence:
first those with $h(P) = 2$, in some order, then those with $h(P) = 3$,
in some order, etc. Suppose that $P_1(x), P_2(x), \ldots$ is such a sequence.
Each $P_k(x)$ has finitely many zeros; write down all the zeros of
$P_1(x)$ in some order, then all those of $P_2(x)$ in some order, etc. Let
this sequence be $\theta_1, \theta_2, \ldots$. Now if $\theta_2 = \theta_1$, delete $\theta_2$; if $\theta_3 = \theta_1$
or $\theta_2$, delete $\theta_3$; and in general, if $\theta_k$ is equal to some $\theta$ with smaller
subscript, delete $\theta_k$. Then the resulting sequence $\alpha_1, \alpha_2, \ldots$
contains all algebraic numbers, each just once.
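
The first few layers of this enumeration are small enough to list by machine. The sketch below (the function name `polynomials_of_height` is an assumption introduced for illustration) represents $P(x) = a_0 x^n + \cdots + a_n$ by the tuple $(a_0, \ldots, a_n)$ with $a_0 \ne 0$ and collects those with $n + |a_0| + \cdots + |a_n| = h$.

```python
from itertools import product


def polynomials_of_height(h):
    """All tuples (a_0, ..., a_n), n >= 1, a_0 != 0, with
    n + |a_0| + ... + |a_n| == h; a_0 is the leading coefficient."""
    found = []
    for n in range(1, h):                  # degree n >= 1 (nonconstant)
        s = h - n                          # required value of |a_0| + ... + |a_n|
        for coeffs in product(range(-s, s + 1), repeat=n + 1):
            if coeffs[0] != 0 and sum(abs(c) for c in coeffs) == s:
                found.append(coeffs)
    return found


counts = {h: len(polynomials_of_height(h)) for h in range(1, 4)}
print(counts)
print(sorted(polynomials_of_height(3)))
```

The counts agree with the text: none for $h(P) = 1$, the two polynomials $\pm x$ for $h(P) = 2$, and the eight polynomials $\pm x \pm 1$, $\pm 2x$, $\pm x^2$ for $h(P) = 3$.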

To summarize, if a number can be approximated sufficiently well
by rational numbers, it is transcendental, but there are transcendental
numbers which cannot be approximated even as well as some
quadratic irrationalities.

*See, for example, Volume I, Theorem 9-12.

PROBLEMS

1. Show that $\xi$ is a Liouville number if the partial quotients in its
continued fraction expansion,

$$\xi = a_0 + \cfrac{1}{a_1 + \cdots},$$

have the property that

$$\limsup_{k \to \infty} \frac{\log a_{k+1}}{\log ((a_1 + 1) \cdots (a_k + 1))} = \infty.$$

[Hint: Show from the recursion relation for the successive convergents
that $q_k < (a_1 + 1) \cdots (a_k + 1)$, and then use Theorem 2-6.]

2. Investigate the implications of Theorem 4-15 as regards
transcendental numbers.


5-3 A criterion for transcendence. In order to obtain an
approximability condition which is equivalent to transcendence, we must
replace the linear expression $q\xi - p$ occurring in the inequality (4)
by a polynomial in $\xi$.

Theorem 5-4. A real or complex number $\xi$ is transcendental if and
only if there corresponds to each $\omega > 0$ a positive integer $n$, such that
the inequality

$$0 < |x_0 + x_1 \xi + \cdots + x_n \xi^n| < \frac{1}{X^{\omega}} \qquad (5)$$

has infinitely many integral solutions $x_0, \ldots, x_n$, where

$$X = \max (|x_0|, \ldots, |x_n|).$$

It is to be noticed that the Liouville numbers (those for which (4)
has a solution for each $\omega$) are precisely the numbers for which we
can take $n = 1$ for every $\omega$. In general, however, $n$ increases with $\omega$.

Proof: We first prove that the condition is sufficient. Let $\alpha = \alpha_1$
be algebraic of degree $g$, let $f(x) = a_0 + a_1 x + \cdots + a_g x^g$ be that
multiple of its defining polynomial which has relatively prime
coefficients in $Z$, with $a_g > 0$, and let $\alpha_1, \ldots, \alpha_g$ be its conjugates. Let
$h(x) = x_0 + x_1 x + \cdots + x_n x^n$ ($x_n > 0$) be any polynomial with
integral coefficients, and with zeros $\beta_1, \ldots, \beta_n$ distinct from
$\alpha_1, \ldots, \alpha_g$. Then

$$0 < \left| x_n^g \prod_{i=1}^{n} f(\beta_i) \right| = \left| x_n^g a_g^n \prod_{i=1}^{n} \prod_{j=1}^{g} (\beta_i - \alpha_j) \right| = a_g^n \prod_{j=1}^{g} |h(\alpha_j)|,$$

so that if $X = \max (|x_0|, \ldots, |x_n|)$, then

$$0 < |h(\alpha)| = \frac{\left| x_n^g \prod_{i=1}^{n} f(\beta_i) \right|}{a_g^n \prod_{j=2}^{g} |h(\alpha_j)|} = \frac{\left| x_n^g \prod_{i=1}^{n} f(\beta_i) \right|}{a_g^n \prod_{j=2}^{g} \left| \dfrac{h(\alpha_j)}{X} \right| \cdot X^{g-1}}. \qquad (6)$$

But

$$x_n^g \prod_{i=1}^{n} f(\beta_i)$$

is a symmetric polynomial with integral coefficients in the $\beta_i$, of
degree $g$ in each $\beta_i$, and is therefore, by the Symmetric Function
Theorem, a polynomial of total degree $g$, with integral coefficients,
in the elementary symmetric functions $-x_{n-1}/x_n$, $x_{n-2}/x_n$, $\ldots$,
$\pm x_0/x_n$. Hence the numerator in the expression (6) is a positive
integer, and we have

$$|h(\alpha)| \ge \frac{1}{a_g^n \prod_{j=2}^{g} \left| \dfrac{h(\alpha_j)}{X} \right| \cdot X^{g-1}}.$$

Now if $r = \max (|\alpha_1|, \ldots, |\alpha_g|)$, then

$$\left| \frac{h(\alpha_j)}{X} \right| \le 1 + |\alpha_j| + \cdots + |\alpha_j|^n \le 1 + r + \cdots + r^n,$$

so that the quantity

$$\frac{1}{a_g^n \prod_{j=2}^{g} \left| \dfrac{h(\alpha_j)}{X} \right|}$$

has a positive lower bound $A(n, \alpha)$ depending only on $\alpha$ and $n$. Thus

$$|h(\alpha)| \ge \frac{A(n, \alpha)}{X^{g-1}}. \qquad (7)$$

It follows that if (5) has infinitely many solutions, then
$\xi$ cannot be algebraic of degree less than $\omega + 1$. Since $\omega$ can be
arbitrarily large, $\xi$ cannot be algebraic.

The necessity of the condition of Theorem 5-4 is a consequence
of the following more general theorem.

Theorem 5-5. If $\vartheta_1, \ldots, \vartheta_n$ are complex numbers, then for a
suitable $c$ which depends only on $n$ and $\vartheta_1, \ldots, \vartheta_n$, the inequality

$$|x_0 + x_1 \vartheta_1 + \cdots + x_n \vartheta_n| < \frac{c}{X^{(n-1)/2}} \qquad (8)$$

has infinitely many integral solutions $x_0, \ldots, x_n$.

If $\xi$ is transcendental, we can take $\vartheta_k = \xi^k$; since no polynomial
in $\xi$ vanishes, it follows that (5) has infinitely many solutions if
$n = [2\omega + 2]$.

Proof: The theorem is trivial if $n = 1$. For $n > 1$, put

$$c' = c'(\vartheta_1, \ldots, \vartheta_n) = 1 + |\vartheta_1| + \cdots + |\vartheta_n|,$$

let $h \ge 2$ be a positive integer, and let $x_0', x_1', \ldots, x_n'$ range
independently over the integers from $-h$ to $h$ inclusive. Since each of
the $n + 1$ numbers $x_k'$ can assume any of $2h + 1$ values, there are
$(2h + 1)^{n+1} = t$ expressions

$$L(\vartheta_1, \ldots, \vartheta_n) = x_0' + x_1' \vartheta_1 + \cdots + x_n' \vartheta_n, \qquad |x_k'| \le h.$$

Let these be, in some order, $L_1, \ldots, L_t$. Clearly

$$|L_i(\vartheta_1, \ldots, \vartheta_n)| \le c'h,$$

so that all the points $L_i(\vartheta_1, \ldots, \vartheta_n)$ lie in the square of side $2c'h$
with its center at the origin of the complex plane. Subdivide this
square into $m^2$ subsquares of side $2c'h/m$ each; then if $m^2 < t$
there must be at least one subsquare containing more than one point
$L(\vartheta_1, \ldots, \vartheta_n)$. We can fulfill the condition $m^2 < t$ by taking

$$m = [(2h + 1)^{(n+1)/2}] - 1.$$

For this $m$, suppose that the points

$$L_1(\vartheta_1, \ldots, \vartheta_n) = x_0' + x_1' \vartheta_1 + \cdots + x_n' \vartheta_n,$$

$$L_2(\vartheta_1, \ldots, \vartheta_n) = x_0'' + x_1'' \vartheta_1 + \cdots + x_n'' \vartheta_n$$

lie in a common subsquare; the distance between them does
not exceed the length of the diagonal of the subsquare, which is
$2\sqrt{2}\, c'h/m$. So if we put $x_0 = x_0' - x_0''$, $\ldots$, $x_n = x_n' - x_n''$ (so
that $X \le h - (-h) = 2h$), and

$$L(\vartheta_1, \ldots, \vartheta_n) = L_1(\vartheta_1, \ldots, \vartheta_n) - L_2(\vartheta_1, \ldots, \vartheta_n) = x_0 + x_1 \vartheta_1 + \cdots + x_n \vartheta_n,$$

then

$$|L(\vartheta_1, \ldots, \vartheta_n)| \le \frac{2\sqrt{2}\, c'h}{m} < \frac{4c'h}{(2h)^{(n+1)/2}} = \frac{2c'}{(2h)^{(n-1)/2}} \le \frac{2c'}{X^{(n-1)/2}}. \qquad (9)$$

Hence (8) has at least one solution, with $c = 2c'$.

If $L(\vartheta_1, \ldots, \vartheta_n) = 0$, then $xL(\vartheta_1, \ldots, \vartheta_n) = 0$
for every integer $x$, and (8) has infinitely many solutions. In the
contrary case, choose $h_1$ so large that

$$\frac{2c'}{(2h_1)^{(n-1)/2}} < |L(\vartheta_1, \ldots, \vartheta_n)|,$$

and repeat the entire argument with $h$ replaced by $h_1$. Calling the
new form thus produced $L^{(1)}$, we have, by the analog of (9) and the
definition of $h_1$, that

$$|L^{(1)}(\vartheta_1, \ldots, \vartheta_n)| < |L(\vartheta_1, \ldots, \vartheta_n)|,$$

so that we have a second solution of (8). Continuing the process,
we can obtain arbitrarily many solutions.
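
The pigeonhole step can be illustrated by direct search. The sketch below computes the quantities $c'$, $m$, and the diagonal bound $2\sqrt{2}\,c'h/m$ from the proof, and then finds by brute force the nonzero integer vector with $|x_k| \le 2h$ that minimizes $|x_0 + x_1\vartheta_1 + \cdots + x_n\vartheta_n|$. The function name `smallest_form` and the sample numbers $\vartheta_k$ are assumptions chosen for illustration; floating-point arithmetic is used, so the output is approximate.

```python
from itertools import product
from math import floor, sqrt


def smallest_form(thetas, h):
    """Return (min |x_0 + sum x_k*theta_k|, minimizing vector, pigeonhole bound)
    over nonzero integer vectors with |x_k| <= 2h."""
    n = len(thetas)
    c_prime = 1 + sum(abs(t) for t in thetas)
    m = floor((2 * h + 1) ** ((n + 1) / 2)) - 1
    bound = 2 * sqrt(2) * c_prime * h / m      # diagonal of one subsquare
    best_value, best_vector = None, None
    for xs in product(range(-2 * h, 2 * h + 1), repeat=n + 1):
        if not any(xs):
            continue                            # skip the zero vector
        value = abs(xs[0] + sum(x * t for x, t in zip(xs[1:], thetas)))
        if best_value is None or value < best_value:
            best_value, best_vector = value, xs
    return best_value, best_vector, bound


# Illustrative data (assumptions): n = 3, with one non-real theta.
thetas = [sqrt(2), 1j * sqrt(3), sqrt(5)]
best_value, best_vector, bound = smallest_form(thetas, h=4)
print(best_value, best_vector, bound)
```

The minimum found by the search cannot exceed the pigeonhole bound, since the proof exhibits a nonzero vector in the same range that meets it.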


PROBLEM 

Show that if the numbers $\vartheta_1, \ldots, \vartheta_n$ are real, then Theorem 5-5 remains
correct if the inequality (8) is replaced by

$$|x_0 + x_1 \vartheta_1 + \cdots + x_n \vartheta_n| < \frac{c}{X^n}.$$


5-4 Measure of transcendence. Mahler's classification. In light
of Theorem 5-4, we make the following definition: a function $\varphi(n, X)$
is called a transcendence measure for the transcendental number $\xi$ if
for each $n$ there is a constant $c_n$ such that for every $X \ge 1$,

$$|x_0 + x_1 \xi + \cdots + x_n \xi^n| > c_n \varphi(n, X)$$

for each set of integers $x_0, \ldots, x_n$ of height $X = \max |x_k|$.
By Theorem 5-5, any such $\varphi(n, X)$ is no larger in order of magnitude
than $X^{-(n-1)/2}$. A theorem giving a measure of transcendence of a
number $\xi$ represents a refinement of the assertion that $\xi$ is
transcendental; such measures have been given for certain numbers. In
Section 5-5 we shall determine a measure of transcendence for $e$.

Mahler has elaborated on the theory of transcendence measure in
the following way. Let $z$ be a complex number, and put

$$\omega_n(X, z) = \omega_n(X) = \min \left| \sum_{k=0}^{n} x_k z^k \right|, \qquad (10)$$

where the minimum is extended over all those sets of rational integral
coefficients $x_0, \ldots, x_n$ of heights at most $X$ for which

$$\sum_{k=0}^{n} x_k z^k \ne 0.$$

Then $\omega_n(X)$ is at most 1, and is a nonincreasing function of both
$X$ and $n$. Put

$$\omega_n(X) = X^{-\rho_n(X)}, \qquad (11)$$

so that

$$\rho_n(X) = \frac{\log (1/\omega_n(X))}{\log X},$$

and let

$$\omega_n(z) = \omega_n = \limsup_{X \to \infty} \rho_n(X), \qquad \omega(z) = \omega = \limsup_{n \to \infty} \frac{\omega_n}{n}.$$

Each of $\omega_n$ and $\omega$ is either $+\infty$ or a non-negative number. If $\omega_n$ is
infinite and $n' > n$, then $\omega_{n'}$ is also infinite; hence there is an index
$\mu(z) = \mu$, which may be finite or infinite, such that $\omega_n$ is finite for
$n < \mu$ and infinite for $n \ge \mu$. The two quantities $\omega$, $\mu$ are never
finite simultaneously, for the finiteness of $\mu$ implies that there is an
$n < \infty$ such that $\omega_n = \infty$, whence $\omega = \infty$. The number $z$ is called

an $A$-number, if $\omega = 0$, $\mu = \infty$,

an $S$-number, if $0 < \omega < \infty$, $\mu = \infty$,

a $T$-number, if $\omega = \infty$, $\mu = \infty$,

a $U$-number, if $\omega = \infty$, $\mu < \infty$.

If $\mu$ is finite, then there is a fixed integer $n$ such that for every
$\sigma > 0$ there are integers $x_0, \dots, x_n$ such that

\[ |x_0 + x_1 z + \dots + x_n z^n| < X^{-\sigma}, \qquad X = \max(|x_k|). \]

For the case $n = 1$, this is exactly the definition of the Liouville
numbers, so that the U-numbers may be regarded as higher degree
analogs of Liouville numbers. The author has shown that there are
U-numbers of every degree.

If $z$ is algebraic, the inequality (7) shows that $\rho_n(X)$, and hence
also $\omega_n$, remains bounded as $n \to \infty$, so that $\omega = 0$ and $z$ is an
A-number. If, on the other hand, $z$ is transcendental, it follows from
Theorem 5-5 that $\rho_n \ge \tfrac{1}{2}(n - 1)$, whence $\omega \ge \tfrac{1}{2}$. Thus the
A-numbers are precisely the algebraic numbers.

The existence of T-numbers has never been proved. 


Theorem 5-6. If the complex numbers $z$ and $w$ are algebraically
dependent, that is, if there is a polynomial $F(x, y)$ with coefficients in
Z such that $F(z, w) = 0$, then they belong to the same class.


Proof: If $z$ is algebraic and $w$ is algebraically dependent on $z$,
then $w$ is clearly also algebraic. We may therefore suppose that $z$
and $w$ are transcendental.

Let

\[ F(x, y) = \sum_{h=0}^{M} \sum_{k=0}^{N} a_{hk} x^h y^k, \]

and suppose that $F$ is irreducible. (One consequence of this assumption
is that no polynomial in $x$ alone is a factor of $F$.) Write

\[ F(x, y) = \sum_{h=0}^{M} A_h(y) x^h, \qquad \text{where } A_h(y) = \sum_{k=0}^{N} a_{hk} y^k. \]

We may suppose that $A_M(y)$ is not identically zero.

Let $A(x) = a_0 + \dots + a_n x^n$ be a polynomial for which the
minimum is achieved in the definition (10) of $\omega_n(X, z)$, so that in
particular $\max(|a_k|) \le X$. We shall obtain inequalities relating
$\omega(z)$ and $\omega(w)$; since in the definition of these quantities the first
limit is taken on $X$, we temporarily regard $n$ as fixed and $X$ as a
parameter.

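Since it is not the case that for each fixed $y$ the polynomials $F(x, y)$
and $A(x)$ have a common zero, we know by a standard theorem* that
the determinant

\[
R(y) =
\begin{vmatrix}
a_0 & a_1 & \cdots & a_n & & & \\
 & a_0 & a_1 & \cdots & a_n & & \\
 & & \ddots & & & \ddots & \\
 & & & a_0 & a_1 & \cdots & a_n \\
A_0(y) & A_1(y) & \cdots & A_M(y) & & & \\
 & \ddots & & & \ddots & & \\
 & & & A_0(y) & A_1(y) & \cdots & A_M(y)
\end{vmatrix}
\]

(in which the first $M$ rows are formed from the coefficients of $A$, the
last $n$ rows from those of $F$, and all blank entries are zero) is not
identically zero. $R(y)$ is a polynomial in $y$ of degree $nN$ at
most, with coefficients in Z. Since $F$ is a fixed polynomial throughout,
the coefficients in $R(y)$ do not exceed $c_1 X^M$, where $c_1$ is a
constant depending only on $n$ and $F$.

*See, for example, B. L. van der Waerden, Modern Algebra (English
edition, translated by Fred Blum from the second revised German edition),
New York: Frederick Ungar Publishing Co., 1949, Vol. 1, pp. 83-85.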

If for each $l$ with $2 \le l \le M + n$, the $l$th column in the determinant
for $R(y)$ is multiplied by $x^{l-1}$ and added to the first column, the
new first column is

\[ A(x),\ xA(x),\ \dots,\ x^{M-1}A(x),\ F(x, y),\ xF(x, y),\ \dots,\ x^{n-1}F(x, y). \]

Expanding by minors of the new first column, we obtain an identity

\[ R(y) = A(x)g(x, y) + F(x, y)h(x, y), \]

from which

\[ R(w) = A(z)g(z, w). \]

Regarding $g(x, y)$ as a sum of minors, we see that its coefficients are
rational integers not exceeding $c_2 X^{M-1}$ in absolute value, so that

\[ |g(z, w)| < c_3 X^{M-1}. \]

Hence

\[ |A(z)| > c_3^{-1} X^{-M+1} |R(w)|. \]

But

\[ |R(w)| \ge \omega_{nN}(c_1 X^M, w), \]

so

\[ |A(z)| > c_3^{-1} X^{-M+1}\, \omega_{nN}(c_1 X^M, w). \]
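The identity $R(y) = A(x)g(x, y) + F(x, y)h(x, y)$ can be checked by hand in the smallest interesting case. The sketch below assumes the hypothetical choice $F(x, y) = x^2 - y$ (so $M = 2$, $N = 1$) and a linear $A(x) = a_0 + a_1 x$ (so $n = 1$); for that choice the cofactor expansion gives $g = a_0 - a_1 x$ and $h = a_1^2$, and the determinant reduces to $a_0^2 - a_1^2 y$. The helper names are illustrative only.

```python
from itertools import product

def det(m):
    """Determinant by cofactor expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def sylvester_R(a0, a1, y):
    """R(y) for A(x) = a0 + a1*x and F(x, y) = x^2 - y: two rows from A, one row from F."""
    return det([[a0, a1, 0],
                [0, a0, a1],
                [-y, 0, 1]])

def identity_rhs(a0, a1, x, y):
    """A(x) g(x, y) + F(x, y) h(x, y) with g = a0 - a1*x and h = a1^2 (assumed example)."""
    A = a0 + a1 * x
    F = x * x - y
    g = a0 - a1 * x   # combination of cofactors multiplying A(x)
    h = a1 * a1       # cofactor multiplying F(x, y)
    return A * g + F * h

def check(bound=3):
    """Compare both sides on a small grid of integer arguments."""
    grid = range(-bound, bound + 1)
    return all(
        sylvester_R(a0, a1, y) == identity_rhs(a0, a1, x, y) == a0 * a0 - a1 * a1 * y
        for a0, a1, x, y in product(grid, repeat=4)
    )
```

In this example the coefficients of $g$ are bounded by $X = \max(|a_0|, |a_1|)$, in agreement with the bound $c_2 X^{M-1}$.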
It follows from the definition of $A(x)$ that

\[ \omega_n(X, z) > c_3^{-1} X^{-M+1}\, \omega_{nN}(c_1 X^M, w), \]

and so, since $\log (c_1 X^M) = M \log X + \log c_1$, we obtain

\[ \omega_n(z) = \limsup_{X \to \infty} \frac{\log\,(1/\omega_n(X, z))}{\log X} \le M - 1 + M\omega_{nN}(w), \]

\[ \omega(z) = \limsup_{n \to \infty} \frac{\omega_n(z)}{n}
   \le \limsup_{n \to \infty} \frac{(M - 1)N + MN\omega_{nN}(w)}{nN} \le MN\omega(w), \]

and

\[ \mu(w) \le N\mu(z). \]

By symmetry,

\[ \omega(w) \le MN\omega(z) \quad \text{and} \quad \mu(z) \le M\mu(w). \]

Thus $\omega(z)$ and $\omega(w)$ are simultaneously finite or infinite, as are $\mu(z)$
and $\mu(w)$; hence $z$ and $w$ are in the same class.

5-5 Arithmetic properties of the exponential function. In this
section we shall prove a theorem due to Mahler which simultaneously
shows that $e$ is an S-number (and therefore transcendental), gives a
transcendence measure for $e$, and shows that $\pi$ is transcendental.
The transcendence measure is not the most precise one known, but
more exact results are more difficult to prove.

We begin with an algebraic analog of Theorem 5-5. Let $\omega_1, \dots, \omega_m$
be distinct complex numbers (having no connection with the function
$\omega_n(z)$ of the preceding section), and let $r_1, \dots, r_m$ be positive integers.
Instead of asking for rational integers $x_0, \dots, x_m$ for which the
quantity $x_0 + x_1\omega_1 + \dots + x_m\omega_m$ is numerically small, we shall
investigate the polynomials

\[ A_k(z) = A_k(z; r_1, \dots, r_m; \omega_1, \dots, \omega_m), \qquad k = 1, \dots, m, \]

of respective degrees $r_1 - 1, \dots, r_m - 1$ at most, for which the
function

\[ R(z) = R(z; r_1, \dots, r_m; \omega_1, \dots, \omega_m)
        = A_1(z)e^{\omega_1 z} + \dots + A_m(z)e^{\omega_m z} \tag{12} \]

is algebraically small, i.e., has a Maclaurin expansion beginning with
a large power of $z$. The total number of coefficients among the
polynomials $A_k(z)$ is $r = r_1 + \dots + r_m$; if they are taken as
undetermined constants, then the conditions

\[ R(0) = 0, \quad R'(0) = 0, \quad \dots, \quad R^{(r-2)}(0) = 0 \]

yield a system of $r - 1$ linear homogeneous equations in these
$r$ unknowns. Such a system always has solutions distinct from
$(0, 0, \dots, 0)$. Let $R(z)$ temporarily designate any of the functions
obtained in this manner; thus $R(z)$, which is not identically zero,
certainly has a zero of order $r - 1$ at $z = 0$, and could conceivably
have one of higher order there. Suppose that the actual order is
$r - 1 + E$, so that $R(z)$ has an expansion

\[ R(z) = \sum_{h=r+E-1}^{\infty} a_h z^h, \qquad a_{r+E-1} \ne 0. \]

The non-negative integer $E$ is called the excess, and $m$ is called the
order, of $R(z)$. We first show that the excess is always equal to zero.

At least one of the polynomials $A_k(z)$ does not vanish identically,
and with no loss in generality we may suppose it to be $A_1(z)$. It is
easily proved by induction that if $D = d/dz$,

\[ D^a e^{\omega z}A(z) = e^{\omega z}(D + \omega)^a A(z) \tag{13} \]

for every positive integer $a$ and every function $A(z)$ with sufficiently
many derivatives. Moreover, if $A(z)$ is a polynomial which is not
identically zero, and $\omega \ne 0$, then $(D + \omega)^a A(z)$ is a polynomial of
the same degree as $A(z)$. Hence

\[ D^{r_m}\bigl(e^{-\omega_m z}R(z)\bigr)
   = D^{r_m}\bigl(A_1(z)e^{(\omega_1-\omega_m)z} + \dots + A_{m-1}(z)e^{(\omega_{m-1}-\omega_m)z} + A_m(z)\bigr) \]
\[ = A_1^*(z)e^{(\omega_1-\omega_m)z} + \dots + A_{m-1}^*(z)e^{(\omega_{m-1}-\omega_m)z}
   = R^*(z; r_1, \dots, r_{m-1}; \omega_1 - \omega_m, \dots, \omega_{m-1} - \omega_m), \]

where $A_1^*$ is not identically zero and, as implied by the notation,
$\deg A_k^* \le r_k - 1$ for $k = 1, \dots, m - 1$. Clearly

\[ R^{*(p)}(0; r_1, \dots, r_{m-1}; \omega_1 - \omega_m, \dots, \omega_{m-1} - \omega_m) = 0 \]

for $p = 0, 1, \dots, r - r_m + E - 2$, so that from an R-function of
order $m$ and excess $E$ we have obtained another of order $m - 1$ and
excess $E$. Repeating the process, we come finally to a function
$R_1(z) = R(z; r_1; \omega) = A(z)e^{\omega z}$ of order 1 and excess $E$. But if
$A(z) = s_0 + s_1 z + \dots + s_{r_1-1}z^{r_1-1}$, the conditions
$R_1(0) = \dots = R_1^{(r_1+E-2)}(0) = 0$ force $s_0 = \dots = s_{r_1-1} = 0$
as soon as $E > 0$, so that there is certainly
no such function which does not vanish identically, if $E > 0$. Hence
$R^{(r-1)}(0) \ne 0$, or equivalently, the coefficient of $z^{r-1}$ in the Maclaurin
expansion of $R(z)$ is not zero, while all preceding coefficients are zero.
Introducing an appropriate numerical factor, we can put

\[ R(z) = \frac{z^{r-1}}{(r - 1)!} + a_r z^r + \cdots. \]

The function $R$ and the coefficients $a_r, a_{r+1}, \dots$ are now uniquely
determined, since if there were two such functions for given
$\omega_1, \dots, \omega_m, r_1, \dots, r_m$, their difference would have positive excess.
Moreover, while we have so far known only that not all of the
polynomials $A_1(z), \dots, A_m(z)$ are identically zero, we now see that in
fact they are of exact degrees $r_1 - 1, \dots, r_m - 1$, respectively,
since otherwise we could have begun with lower degree polynomials
and arrived at a function of positive excess. Finally, we see that
$R(z)$ is symmetric in the pairs of arguments $(r_1, \omega_1), \dots, (r_m, \omega_m)$,
since the pairs can be permuted while the solution (subject to all the
imposed conditions) is unique. This can also be seen by noting that
$R(z)$ is the unique solution of the homogeneous linear differential
equation

\[ (D - \omega_1)^{r_1} \cdots (D - \omega_m)^{r_m} y = 0 \]

for which $R(0) = \dots = R^{(r-2)}(0) = 0$ and $R^{(r-1)}(0) = 1$, and the
factors in the differential operator may be permuted at will.
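Identity (13), on which the reduction of the order above rests, can be verified mechanically for polynomial $A(z)$. The sketch below is only an illustration under stated assumptions: polynomials are stored as coefficient lists, lowest degree first, and the function names are hypothetical. One routine applies the product rule $a$ times to $e^{\omega z}A(z)$; the other expands $(D + \omega)^a$ by the binomial theorem.

```python
from math import comb

def trim(p):
    """Drop trailing zero coefficients (coefficients are stored lowest degree first)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def deriv(p):
    return trim([k * a for k, a in enumerate(p)][1:] or [0])

def exp_poly_derivative(A, omega, a):
    """Return B with D^a (e^{omega z} A(z)) = e^{omega z} B(z), one product-rule step at a time."""
    B = trim(A)
    for _ in range(a):
        B = add(deriv(B), scale(omega, B))
    return B

def shifted_operator_power(A, omega, a):
    """Return (D + omega)^a A(z), expanded as sum_j C(a, j) omega^(a-j) D^j A."""
    total, dA = [0], trim(A)
    for j in range(a + 1):
        total = add(total, scale(comb(a, j) * omega ** (a - j), dA))
        dA = deriv(dA)
    return total
```

For $A(z) = 1 + 2z^2$, $\omega = 3$, $a = 2$ both routines return the coefficients of $13 + 24z + 18z^2$, which also illustrates that the degree is preserved when $\omega \ne 0$.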

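We now obtain explicit expressions for $R(z)$ and the $A_k(z)$. Clearly

\[ R(z; r_1; \omega_1) = \frac{z^{r_1-1}}{(r_1 - 1)!}\, e^{\omega_1 z}, \tag{14} \]

since this function has all the requisite properties and there is only
one such function. Suppose that $R(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1})$
has already been determined. Then if $J$ is the operator

\[ Jf(z) = \int_0^z f(t)\, dt, \]

we have by (13) that

\[ (D - \omega_1)^{r_1} \cdots (D - \omega_\mu)^{r_\mu}
   \bigl(e^{\omega_\mu z} J^{r_\mu}(e^{-\omega_\mu z} R(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1}))\bigr) \]
\[ = (D - \omega_1)^{r_1} \cdots (D - \omega_{\mu-1})^{r_{\mu-1}}
   \bigl(e^{\omega_\mu z} D^{r_\mu} J^{r_\mu}(e^{-\omega_\mu z} R(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1}))\bigr) \]
\[ = (D - \omega_1)^{r_1} \cdots (D - \omega_{\mu-1})^{r_{\mu-1}} R(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1}) = 0, \]

and since $R(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1})$ has a zero of order
$r_1 + \dots + r_{\mu-1} - 1$ at 0, the function

\[ e^{\omega_\mu z} J^{r_\mu}(e^{-\omega_\mu z} R(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1})) \]

has a zero of order $r_1 + \dots + r_\mu - 1$ at 0, and it clearly has leading
coefficient $((r_1 + \dots + r_\mu - 1)!)^{-1}$. Hence

\[ R(z; r_1, \dots, r_\mu; \omega_1, \dots, \omega_\mu)
   = e^{\omega_\mu z} J^{r_\mu}(e^{-\omega_\mu z} R(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1})), \tag{15} \]

and consequently

\[ R(z) = e^{\omega_m z} J^{r_m} e^{(\omega_{m-1}-\omega_m)z} J^{r_{m-1}} \cdots
          e^{(\omega_2-\omega_3)z} J^{r_2} e^{(\omega_1-\omega_2)z}\, \frac{z^{r_1-1}}{(r_1 - 1)!}. \]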

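We now use the standard formula

\[ J^a f(z) = \int_0^z \frac{(z - t)^{a-1}}{(a - 1)!}\, f(t)\, dt, \]

which is easily verified by integration by parts. We have

\[ e^{\omega_2 z} J^{r_2}\Bigl(e^{(\omega_1-\omega_2)z}\, \frac{z^{r_1-1}}{(r_1 - 1)!}\Bigr)
   = e^{\omega_2 z} \int_0^z \frac{(z - t_1)^{r_2-1}\, t_1^{r_1-1}}{(r_1 - 1)!\,(r_2 - 1)!}\, e^{(\omega_1-\omega_2)t_1}\, dt_1 \]
\[ = \int_0^z \frac{t_1^{r_1-1}(z - t_1)^{r_2-1}}{(r_1 - 1)!\,(r_2 - 1)!}\, e^{\omega_1 t_1 + \omega_2(z - t_1)}\, dt_1, \]

and by induction we see that

\[ R(z) = \int_0^z dt_{m-1} \int_0^{t_{m-1}} dt_{m-2} \cdots \int_0^{t_2}
   \frac{t_1^{r_1-1}(t_2 - t_1)^{r_2-1} \cdots (z - t_{m-1})^{r_m-1}}{(r_1 - 1)!\,(r_2 - 1)! \cdots (r_m - 1)!} \]
\[ \qquad \times\, e^{\omega_1 t_1 + \omega_2(t_2 - t_1) + \dots + \omega_m(z - t_{m-1})}\, dt_1. \tag{16} \]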
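The standard formula for $J^a$ used above can be checked exactly on monomials $f(t) = t^k$, for which $a$ successive integrations from 0 give $J^a z^k = \frac{k!}{(k+a)!} z^{k+a}$. The sketch below compares that coefficient with the one obtained from the kernel $(z - t)^{a-1}/(a - 1)!$ by binomial expansion and termwise integration. The function names are illustrative, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction
from math import comb, factorial

def iterated_integral_coeff(k, a):
    """Coefficient c with J^a z^k = c z^(k+a), integrating from 0 a total of a times."""
    c = Fraction(1)
    for step in range(1, a + 1):
        c /= (k + step)
    return c

def kernel_formula_coeff(k, a):
    """Coefficient of z^(k+a) in the integral of (z-t)^(a-1)/(a-1)! * t^k over 0 <= t <= z."""
    total = Fraction(0)
    for j in range(a):
        # C(a-1, j) z^(a-1-j) (-t)^j t^k integrates to (-1)^j C(a-1, j) z^(a+k) / (j+k+1)
        total += Fraction((-1) ** j * comb(a - 1, j), j + k + 1)
    return total / factorial(a - 1)

def formulas_agree(max_k=6, max_a=6):
    """True if both computations coincide for all small k and a."""
    return all(iterated_integral_coeff(k, a) == kernel_formula_coeff(k, a)
               for k in range(max_k + 1) for a in range(1, max_a + 1))
```

Since both sides of the formula are linear in $f$, agreement on monomials gives agreement on all polynomials.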


Before deducing an explicit formula for $A_k(z)$, we recall certain
properties of inverse operators. The operator $D^{-1}$, as applied to an
integral combination $f(z)$ of polynomials and exponential functions,
yields that antiderivative which contains no constant of integration.
Hence

\[ J^p f(z) = D^{-p} f(z) + \varphi(z), \]

where $\varphi(z)$ is a function annihilated by $D^p$, that is, a polynomial of
degree $p - 1$ at most. More generally, if $\omega \ne 0$ we define, by analogy
to (13),

\[ (D - \omega)^{-p} f(z) = e^{\omega z} D^{-p}(e^{-\omega z} f(z)), \]

so that

\[ e^{\omega z} J^p (e^{-\omega z} f(z)) = (D - \omega)^{-p} f(z) + \psi(z), \tag{17} \]

where $\psi(z)$ is annihilated by $(D - \omega)^p$; that is, it is $e^{\omega z}$ times a
polynomial of degree $p - 1$ at most. Since


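\[ (D - \omega)\Bigl(\frac{z^n}{\omega} + \frac{n z^{n-1}}{\omega^2} + \frac{n(n - 1) z^{n-2}}{\omega^3} + \dots + \frac{n!}{\omega^{n+1}}\Bigr) = -z^n, \]

and since no term of the operand is annihilated by $D - \omega$, we can
write

\[ (D - \omega)^{-1} z^n = -\sum_{k=0}^{n} \frac{n!}{(n - k)!}\, \frac{z^{n-k}}{\omega^{k+1}}. \]

More generally, it can be shown that if $F$ is any polynomial of degree
$n$ for which $F(0) \ne 0$, then $(F(D))^{-1} z^n$ can be written as

\[ (a_0 + a_1 D + \dots + a_n D^n) z^n, \tag{18} \]

where $a_0 + \dots + a_n u^n$ is the Maclaurin expansion of $(F(u))^{-1}$ to
$n + 1$ terms.*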
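The closed form for $(D - \omega)^{-1} z^n$ can be confirmed by applying $D - \omega$ to it and recovering $z^n$. The following is a minimal sketch with hypothetical helper names; polynomials are coefficient lists, lowest degree first, and exact fractions are used so that rational $\omega$ are handled without rounding.

```python
from fractions import Fraction
from math import factorial

def inverse_on_monomial(n, omega):
    """Coefficients of -(z^n/w + n z^(n-1)/w^2 + ... + n!/w^(n+1)), lowest degree first."""
    w = Fraction(omega)
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n + 1):
        # the term with z^(n-k) carries the factor n!/(n-k)! divided by w^(k+1)
        coeffs[n - k] = -Fraction(factorial(n), factorial(n - k)) / w ** (k + 1)
    return coeffs

def apply_D_minus_omega(p, omega):
    """Return the coefficients of p'(z) - omega * p(z)."""
    w = Fraction(omega)
    out = [-w * c for c in p]
    for k in range(1, len(p)):
        out[k - 1] += k * p[k]
    return out

def recovers_monomial(n, omega):
    """True if (D - omega) applied to the closed form gives exactly z^n."""
    return apply_D_minus_omega(inverse_on_monomial(n, omega), omega) == [0] * n + [1]
```

The same coefficients arise from (18) with $F(u) = u - \omega$, whose reciprocal has the Maclaurin expansion $-\sum_k u^k/\omega^{k+1}$.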

We can now prove that for $k = 1, \dots, m$,

\[ A_k(z) = \Bigl(\prod_{\substack{h=1 \\ h \ne k}}^{m} (D + \omega_k - \omega_h)^{-r_h}\Bigr)\, \frac{z^{r_k-1}}{(r_k - 1)!}. \tag{19} \]

For $m = 1$, the empty product is interpreted as the identity operator,
of course, and in this case the correctness of (19) follows from
equation (14). Suppose that it is correct for all polynomials

\[ A_k(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1}) \]

with $\mu - 1$ pairs $r_k, \omega_k$.

*A more complete discussion of inverse operators is given in E. L. Ince,
Ordinary Differential Equations, New York: Longmans, Green & Co.,
Inc., 1926; reprinted by Dover Publications, New York, 1944; pp. 138-140.

Then, by (15) and (17),

\[ R(z; r_1, \dots, r_\mu; \omega_1, \dots, \omega_\mu)
   = e^{\omega_\mu z} J^{r_\mu}\Bigl(e^{-\omega_\mu z} \sum_{k=1}^{\mu-1} A_k(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1})\, e^{\omega_k z}\Bigr) \]
\[ = \sum_{k=1}^{\mu-1} e^{\omega_\mu z}\Bigl(e^{(\omega_k - \omega_\mu)z}(D - \omega_\mu + \omega_k)^{-r_\mu}
     A_k(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1}) + P_k(z)\Bigr) \]
\[ = \sum_{k=1}^{\mu-1} e^{\omega_k z}(D - \omega_\mu + \omega_k)^{-r_\mu}
     A_k(z; r_1, \dots, r_{\mu-1}; \omega_1, \dots, \omega_{\mu-1}) + P(z)e^{\omega_\mu z}, \]

where $P_k(z)$ and $P(z)$ are polynomials of degree $r_\mu - 1$ at most. It
follows that (19) is correct for $k = 1$, for arbitrary $m$, and its truth
for $k \ge 2$ follows from the previously noted symmetry of $R(z)$ in
the pairs $r_k, \omega_k$.

For fixed complex numbers $\omega_1, \dots, \omega_m$, our considerations up to
this point are valid for all the functions $R(z; r_1, \dots, r_m; \omega_1, \dots, \omega_m)$
corresponding to arbitrary sets $r_1, \dots, r_m$ of positive integers.

We now specialize the parameters so as to obtain a collection of
functions depending on a single parameter $p$.

For $h$ and $k$ in the sequence $1, 2, \dots, m$, define

\[ \delta_{hk} = \begin{cases} 1 & \text{if } h = k, \\ 0 & \text{if } h \ne k, \end{cases} \]

and put

\[ R_h(z) = R_h(z; p; \omega_1, \dots, \omega_m) = R(z; p + \delta_{1h}, \dots, p + \delta_{mh}; \omega_1, \dots, \omega_m), \]
\[ A_{hk}(z) = A_{hk}(z; p; \omega_1, \dots, \omega_m) = A_k(z; p + \delta_{1h}, \dots, p + \delta_{mh}; \omega_1, \dots, \omega_m). \]

Here $p$ is a fixed but arbitrary positive integer. We form the square
matrix

\[ A(z) = (A_{hk}(z)), \qquad h, k = 1, \dots, m, \]

having determinant $D(z)$. Let the minor determinant of $A_{hk}(z)$ in
$D(z)$ be $D_{hk}(z)$.

Now $A_{hk}(z)$ is a polynomial in $z$ of degree $p + \delta_{hk} - 1$, and the
coefficient of the highest power of $z$ is, by (19),

\[ \frac{1}{(p + \delta_{hk} - 1)!} \prod_{\substack{l=1 \\ l \ne k}}^{m} (\omega_k - \omega_l)^{-(p + \delta_{hl})}. \]
Hence in the expansion of $D(z)$, the term formed from the elements of
the main diagonal will be of higher degree than any other term, and
$D(z)$ is therefore a polynomial of degree $mp$ with the coefficient of the
highest power of $z$ equal to

\[ \frac{1}{(p!)^m} \prod_{k=1}^{m} \prod_{\substack{l=1 \\ l \ne k}}^{m} (\omega_k - \omega_l)^{-p}. \]

If, on the other hand, we solve the system of equations

\[ \sum_{k=1}^{m} A_{hk}(z)e^{\omega_k z} = R_h(z), \qquad h = 1, \dots, m, \]

for $e^{\omega_k z}$, we obtain the identity

\[ D(z)e^{\omega_k z} = \sum_{h=1}^{m} (-1)^{h+k} D_{hk}(z)R_h(z). \]

Since the expansion of $R_h(z)$ begins with the term $z^{mp}/(mp)!$, the
polynomial $D(z)$ is divisible by $z^{mp}$. Hence

\[ D(z) = \frac{z^{mp}}{(p!)^m} \prod_{k=1}^{m} \prod_{\substack{l=1 \\ l \ne k}}^{m} (\omega_k - \omega_l)^{-p}, \tag{20} \]

and $D(z)$ vanishes only at $z = 0$.

Let $c_1, c_2, \dots$ be positive constants depending only on $m, \omega_1, \dots, \omega_m$.
(In particular, they must not depend on $p$, which will eventually be
large.)

Examination of equation (16) shows that for $1 \le h \le m$,


RhW 




From (19), we obtain 


Ahk(z) = 


1/5* 


I 


(p + ^A*"” 1) ^ 


where the sums need not be extended past the index p. Let 


n = n (wjfe — oja)* 

*.**1 

h<k 

so that can be regarded as a polynomial of total degree m(7n — l)/2 
in coi, . - . , with coefficients in Z not exceeding 2"'^ in absolute 
value. Since no exponent p + ^hi + hi the above sums exceeds 
2p + 1, the expression 

au- = 


is a polynomial in wi, . 


. , o),n, of total degree 



{2p + 1 ) 


at most, whose coefficients are rational integers of the order of 
magnitude 0(c2^p!). Finally, we put 


so that 


rn = 


m 

E 

A' = l 






The quantities $r_1, \dots, r_m$ are linear forms in the numbers $e^{\omega_1}, \dots, e^{\omega_m}$, and
they are linearly independent, in the sense that no nontrivial linear
combination of the vectors $(a_{h1}, \dots, a_{hm})$, for $1 \le h \le m$, is the zero vector.
This is equivalent to the assertion that $D(1) \ne 0$, which follows
from (20).

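Theorem 5-7. Suppose that $\omega_1, \dots, \omega_m$ all lie in an algebraic
number field $K$ of degree $g$, and let

\[ L_h = \sum_{k=1}^{m} b_{hk} e^{\omega_k}, \qquad h = 1, \dots, \mu, \]

be $\mu$ independent linear forms in $e^{\omega_1}, \dots, e^{\omega_m}$ with coefficients $b_{hk}$ in Z.
Suppose that

\[ \frac{g - 1}{g}\, m < \mu \le m, \tag{21} \]

and put

\[ b = \max(|b_{hk}|), \qquad s = \max(|L_h|). \]

Then to each $\varepsilon > 0$ there corresponds a $b_0(\varepsilon)$ such that $s > b^{-\tau-\varepsilon}$ if
$b > b_0(\varepsilon)$, where

\[ \tau = \frac{m\mu g}{\mu g - m(g - 1)} - 1. \]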
Proof: By a well-known theorem on independence,* the $\mu$ forms
$L_1, \dots, L_\mu$, together with $m - \mu$ of the forms $r_h$, which we may
designate by $r_{h_1}, \dots, r_{h_{m-\mu}}$, are independent. Hence the determinant

\[
\Delta =
\begin{vmatrix}
a_{h_1 1} & \cdots & a_{h_1 m} \\
\vdots & & \vdots \\
a_{h_{m-\mu} 1} & \cdots & a_{h_{m-\mu} m} \\
b_{11} & \cdots & b_{1m} \\
\vdots & & \vdots \\
b_{\mu 1} & \cdots & b_{\mu m}
\end{vmatrix}
\]

is not zero; it is obviously a polynomial in $\omega_1, \dots, \omega_m$ of degree at
most

\[ \frac{m(m - 1)(2p + 1)(m - \mu)}{2}, \]

with coefficients in Z of the order of magnitude $O(c_4^p\, p!^{\,m-\mu}\, b^\mu)$. It
follows, first, that there is a rational integer $c_5$ such that $c_5^p \Delta$ is an
integer of $K$, and, second, that

ra = 0 ( (<7 + 1 )'"C4'’P 

= o(cj^pr->‘bn, 

where $c_6$ is an upper bound for the various numbers $\overline{|\omega_k|}$. Hence if
$\Delta, \Delta'', \dots, \Delta^{(g)}$ are the field conjugates of $\Delta$, we have


1 


C6'’(C5'’A") • • • (C5'’A^‘'') 

A 

1 

Nfcs^A) 


Moreover, using subscripts on $\Delta$ to indicate minors, we have

Aifc = Oicg>‘pr-'^~^b^)> = O(cio'’pr''‘h^), 

for 1 < i < w — p, 

Aik = 0(cii'’pr"'‘f>''''^). AikLi-m+p = 0(ci2'’p!'"”'‘^''‘ ‘s)> 

for m — M+1 


*B. L. van der Waerden, Modern Algebra (English edition, translated
by Fred Blum from the second revised German edition), New York:
Frederick Ungar Publishing Co., 1949, Vol. 1, p. 10.

Using the identity 

m —fi Tn 

= X (- 1 + Z 

Z=m-M +1 


it follows that 


or 


1 = 


C^^Pp\}rn-~^)9ly^g-l^ > C15 “ C16 * , 


( 22 ) 


From the inequality (21), the exponent $m(g - 1) - \mu g$ is negative.
Hence the quantity



(M( 7 — m(ff— 1 ) 


may increase for small values of $p$, but it tends to zero as $p$ increases
indefinitely. At any rate, we can say that for $b$ larger than some $c_{17}$,
the smallest value of $p$ for which


2 

is so large that $c_{13}^p$ is negligible as compared to the factorial, and for
such $p$ we have the asymptotic relation


logp! 




t^g — m{g — 1 ) 


log b. 


By (22), for $b > b_0(\varepsilon)$,

\[ s > b^{-\tau-\varepsilon}, \]

where

\[ \tau = \mu g - 1 + \frac{(m - \mu)\mu g^2}{\mu g - m(g - 1)} = \frac{m\mu g}{\mu g - m(g - 1)} - 1. \]

This proves the theorem.


Theorem 5-8. Suppose that $\vartheta_1, \dots, \vartheta_N$ are elements of an
algebraic number field $K$ of degree $g$, and that they are linearly
independent over the rationals, so that no relation of the form

\[ d_1\vartheta_1 + \dots + d_N\vartheta_N = 0 \]

holds with $d_1, \dots, d_N$ rational and not all zero. Then if the
coefficients of the linear form

\[ L = \sum_{\lambda_1=0}^{M_1} \cdots \sum_{\lambda_N=0}^{M_N} b_{\lambda_1 \dots \lambda_N}\, e^{\lambda_1\vartheta_1 + \dots + \lambda_N\vartheta_N} \tag{23} \]

in the quantities $e^{\lambda_1\vartheta_1 + \dots + \lambda_N\vartheta_N}$ are rational integers with

\[ b = \max(|b_{\lambda_1 \dots \lambda_N}|), \]

there is a constant $\tau$, depending only on $g$ and $N$, such that for
sufficiently large $b$,

\[ |L| > b^{-\tau}. \]
Proof: Let $\mu_1, \dots, \mu_N$ be positive integers, and consider the
quantities

\[ L_{l_1 \dots l_N} = e^{l_1\vartheta_1 + \dots + l_N\vartheta_N} L, \qquad
   l_1 = 0, 1, \dots, \mu_1;\ \dots;\ l_N = 0, 1, \dots, \mu_N, \tag{24} \]

their number being

\[ \mu = (\mu_1 + 1) \cdots (\mu_N + 1). \]

If we introduce the exponential factor in (24) inside the summation
in (23), we see that the various $L_{l_1 \dots l_N}$ can be regarded as linear
forms in the quantities $e^{\omega_{\lambda_1 \dots \lambda_N}}$, where

\[ \omega_{\lambda_1 \dots \lambda_N} = \lambda_1\vartheta_1 + \dots + \lambda_N\vartheta_N, \qquad
   \lambda_1 = 0, \dots, M_1 + \mu_1;\ \dots;\ \lambda_N = 0, 1, \dots, M_N + \mu_N, \]

the number of $\omega$'s being

\[ m = (M_1 + \mu_1 + 1) \cdots (M_N + \mu_N + 1). \]

The numbers $\omega_{\lambda_1 \dots \lambda_N}$ are distinct on account of the independence of
$\vartheta_1, \dots, \vartheta_N$ over the rationals, so that we can speak of the
independence of the forms $L_{l_1 \dots l_N}$. To see that as a matter of fact they are
independent, order the subscript sets $\lambda_1 \dots \lambda_N$ and $l_1 \dots l_N$ by
interpreting the $\lambda$'s and $l$'s as digits in the base $q$, for some sufficiently
large $q$. Then there cannot be a linear relation among the coefficient
vectors of any set of forms, since the $\omega_{\lambda_1 \dots \lambda_N}$ with largest subscript
occurs only in the form $L_{l_1 \dots l_N}$ with largest subscript. Finally, there
are positive constants $\alpha$ and $\beta$ which are independent of the
coefficients $b_{\lambda_1 \dots \lambda_N}$, for which

\[ \alpha \le \left| \frac{L_{l_1 \dots l_N}}{L} \right| \le \beta. \]

It follows from Theorem 5-7 that if

\[ \prod_{i=1}^{N} \frac{M_i + \mu_i + 1}{\mu_i + 1} < \frac{g}{g - 1}, \tag{25} \]

then for $b > b(\varepsilon)$,

\[ |L| > b^{-\tau-\varepsilon}, \]

where

\[ \tau = \frac{g(M_1 + \mu_1 + 1) \cdots (M_N + \mu_N + 1)(\mu_1 + 1) \cdots (\mu_N + 1)}
               {g(\mu_1 + 1) \cdots (\mu_N + 1) - (g - 1)(M_1 + \mu_1 + 1) \cdots (M_N + \mu_N + 1)} - 1. \]
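Condition (25) is satisfied if

\[ \mu_i = \left[ \frac{M_i}{\left(\dfrac{2g}{2g - 1}\right)^{1/N} - 1} \right], \qquad i = 1, \dots, N, \]

since then

\[ 1 + \frac{M_i}{\mu_i + 1} < \left(\frac{2g}{2g - 1}\right)^{1/N}, \qquad
   \prod_{i=1}^{N} \left(1 + \frac{M_i}{\mu_i + 1}\right) < \frac{2g}{2g - 1} < \frac{g}{g - 1}. \]

With this choice of $\mu_i$ we have

\[ \tau = \frac{\mu \displaystyle\prod_{i=1}^{N} \left(1 + \frac{M_i}{\mu_i + 1}\right)}
               {1 - \dfrac{g - 1}{g} \displaystyle\prod_{i=1}^{N} \left(1 + \frac{M_i}{\mu_i + 1}\right)} - 1
        < \frac{\dfrac{2g}{2g - 1}\,\mu}{1 - \dfrac{g - 1}{g} \cdot \dfrac{2g}{2g - 1}} - 1
        = 2g\mu - 1. \tag{26} \]

Since $g \ge 1$ and $N \ge 1$, we have $\mu_i \ge M_i \ge 1$. Since $[x] + 1 \le 2x$
for $x \ge 1$, we have

\[ \mu \le \prod_{i=1}^{N} \frac{2M_i}{\left(\dfrac{2g}{2g - 1}\right)^{1/N} - 1}. \]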
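The choice of the $\mu_i$ and the resulting bound (26) can be tried out on small cases. The sketch below is illustrative only: the function names are hypothetical, and the bracket $[x]$ is evaluated without floating-point roots by using the equivalence $j \le M/((2g/(2g-1))^{1/N} - 1)$ if and only if $2g\, j^N \le (2g - 1)(j + M)^N$, which follows by clearing the $N$th root. Condition (25) is tested in the cross-multiplied form $(g - 1)m < g\mu$, which also covers $g = 1$.

```python
from fractions import Fraction
from math import prod

def choose_mu(g, N, M):
    """Largest integer j with 2g * j^N <= (2g - 1) * (j + M)^N, i.e. the bracket [x] in the text."""
    j = 1  # j = 1 always qualifies when g >= 1 and M >= 1
    while 2 * g * (j + 1) ** N <= (2 * g - 1) * (j + 1 + M) ** N:
        j += 1
    return j

def exponent_data(g, Ms):
    """Return (mu_i list, m, mu, tau) for degree g and the bounds M_1, ..., M_N."""
    N = len(Ms)
    mus = [choose_mu(g, N, M) for M in Ms]
    mu = prod(mi + 1 for mi in mus)
    m = prod(M + mi + 1 for M, mi in zip(Ms, mus))
    tau = Fraction(g * m * mu, g * mu - (g - 1) * m) - 1
    return mus, m, mu, tau

if __name__ == "__main__":
    for g, Ms in ((1, [1]), (2, [1, 1]), (3, [2])):
        mus, m, mu, tau = exponent_data(g, Ms)
        print(g, Ms, mus, m, mu, float(tau), 2 * g * mu - 1)
```

For $N = 1$ the routine returns $\mu_1 = (2g - 1)M_1$, in agreement with the remark that the brackets can then be omitted.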
Taking $N = 1$, we have

Corollary 1. If $\vartheta \ne 0$ is algebraic, $e^{\vartheta}$ is an S-number, and in
particular is transcendental.

For $\vartheta = \pi i$, $e^{\vartheta} = -1$ is not transcendental. Hence

Corollary 2. $\pi$ is transcendental.

We also have the following result, first proved by F. Lindemann
in 1882.

Corollary 3. If $\vartheta_1, \dots, \vartheta_N$ are algebraic and are linearly
independent over the rationals, then $e^{\vartheta_1}, \dots, e^{\vartheta_N}$ are algebraically
independent over the field of algebraic numbers, that is, there is no
polynomial $P(z_1, \dots, z_N)$ with algebraic coefficients not all zero for which

\[ P(e^{\vartheta_1}, \dots, e^{\vartheta_N}) = 0. \]

Finally, for $N = 1$ the brackets can be omitted in the definition of
$\mu_1$; then $\mu_1 = (2g - 1)M_1$ and

\[ \tau < 2g\mu - 1 \le 2g((2g - 1)M_1 + 1) - 1 = 2g(2g - 1)M_1 + 2g - 1. \]

Corollary 4. If $\vartheta \ne 0$ is algebraic of degree $g$, then the function

\[ \varphi(n, X) = X^{-2g(2g-1)n - 2g + 1} \]

is a transcendence measure for $e^{\vartheta}$.

5-6 A theorem of Schneider. In addition to the Liouville numbers
and values of the exponential function, many other specific numbers
are known to be transcendental. To indicate the type of results
known, we mention the following:

(a) The Bessel functions $J_0(x)$ and $J_0'(x)$ are transcendental for
algebraic $x \ne 0$.

(b) If $\alpha$ and $\beta$ are algebraic, $\alpha \ne 0$ or 1, and $\beta$ is irrational, then $\alpha^\beta$
is transcendental. (In particular, $e^{\pi} = (-1)^{-i}$ is included.)

(c) At least one of the numbers $g_2, g_3, \omega_1, \omega_2$ associated with a
Weierstrass $\wp$-function is transcendental, and if $g_2$ and $g_3$ are
algebraic, at least one of $z$ and $\wp(z)$ is transcendental.

(d) If $f(x)$ is a polynomial whose value is in Z for argument in Z,
and $f(x) > 0$ for $x > 0$, then the number

\[ 0.f(1)f(2)f(3)\dots, \]

formed by juxtaposing the decimal representations of the values $f(x)$,
is transcendental. (An example is the number $0.1361015\dots$,
generated by $f(x) = (x^2 + x)/2$.)

(e) If $\alpha$ is a positive quadratic irrationality, then the number

\[ \sum_{n=0}^{\infty} [n\alpha]\, z^n \]

is transcendental for algebraic $z \ne 0$.
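The example in item (d) is easy to reproduce. The sketch below, with illustrative function names, juxtaposes the decimal representations of $f(1), f(2), \dots$ for the triangular-number polynomial $f(x) = (x^2 + x)/2$ mentioned there and returns the leading digits after the decimal point.

```python
from itertools import count

def juxtaposed_digits(f, length):
    """First `length` decimal digits after the point of 0.f(1)f(2)f(3)..."""
    digits = []
    for x in count(1):
        digits.extend(str(f(x)))
        if len(digits) >= length:
            return "".join(digits[:length])

def triangular(x):
    """The polynomial f(x) = (x^2 + x)/2, which is integer valued on the integers."""
    return (x * x + x) // 2

if __name__ == "__main__":
    print("0." + juxtaposed_digits(triangular, 20) + "...")
```

The values $1, 3, 6, 10, 15$ give the leading digits $1361015$ quoted in the text.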

On the other hand, it is not known whether the following numbers
are transcendental:

(a) $\gamma = \lim_{n \to \infty} \left(1 + \frac{1}{2} + \dots + \frac{1}{n} - \log n\right)$,

(b) $\zeta(2n + 1) = \sum_{k=1}^{\infty} \frac{1}{k^{2n+1}}$,

(c) $\Gamma(x)$ for algebraic $x$ not in Z,

(d) $e + \pi$, $e\pi$.

The methods used to prove what little is known about specific
transcendental numbers show considerable variety, both in technique
and conception. T. Schneider has recently shown, however, that
several results which earlier required separate proofs can all be
obtained from a single theorem. This theorem says nothing directly
about transcendental numbers; rather, its sense is that if several
transcendental functions assume algebraic values at a large number
of points, then they must either have large rates of growth or be
algebraically dependent (as functions). The prototype of Schneider's
result, proved by G. Pólya in 1920, asserts that if $f$ is an integral
transcendental function which assumes values in Z for $z = 0, 1, 2, \dots$,


then 


.i 


lim sup 

r— > CO 
where, as usual,

\[ M(r) = \max_{|z| = r} (|f(z)|). \]

There have been many refinements and extensions of Pólya's work,
of course; we mention only that by A. Gelfond in 1929, where this
kind of theorem was first used for transcendence investigations. (His
result was that $\alpha^\beta$ is transcendental for algebraic $\alpha \ne 0, 1$, if $\beta$ is an
imaginary quadratic irrationality.)

In this section we shall prove Schneider's theorem, and in the next
we shall apply it to the numbers $\alpha^\beta$. (The facts mentioned above
concerning the $\wp$-function can also be deduced, but the requisite
preliminaries preclude doing so here.) Since the statement of the
theorem is complicated, we first introduce some notation.

By the order of an entire function $f(z)$ we mean, as usual, the
quantity

\[ \limsup_{R \to \infty} \frac{\log \log M(R)}{\log R}; \]

if $f(z)$ is of order $\mu$, then

\[ f(z) = O\bigl(e^{|z|^{\mu+\varepsilon}}\bigr) \]

as $|z| \to \infty$, for every fixed $\varepsilon > 0$. Let $\zeta_1, \zeta_2, \dots$ be an infinite
sequence of complex numbers. Designate by

\[ z_0(m) = z_0, \ \dots, \ z_k(m) = z_k \]

the distinct numbers among $\zeta_1, \dots, \zeta_m$, and by $l_\kappa(m) + 1 = l_\kappa + 1$
the multiplicity of occurrence of $z_\kappa$ among $\zeta_1, \dots, \zeta_m$. Thus

\[ \sum_{\kappa=0}^{k} (l_\kappa + 1) = m. \tag{27} \]

Let $r(m) = r$ be the radius of the smallest circle about the origin
which contains $z_0, \dots, z_k$, and put

\[ \alpha = \liminf_{m \to \infty} \frac{\log m}{\log r}, \tag{28} \]

so that $0 \le \alpha \le \infty$. Let

\[ l = \max(l_0, \dots, l_k). \]

Finally, let $K$ be a fixed algebraic number field of degree $g$, and, as
always, let $\overline{|\alpha|}$ be the maximum of the absolute values of the
conjugates of $\alpha$, for $\alpha$ in $K$.


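Theorem 5-9. Let $f_1(z), \dots, f_n(z)$ be meromorphic functions with
the property that for each $m$, the numbers

\[ f_\nu^{(\lambda)}(z_\kappa), \qquad \lambda = 0, \dots, l_\kappa;\ \kappa = 0, \dots, k;\ \nu = 1, \dots, n, \]

are in $K$. Let $H_\nu(z_\kappa)$ be positive rational integers such that all the
numbers

\[ H_\nu(z_\kappa) f_\nu^{(\lambda)}(z_\kappa), \qquad \lambda = 0, \dots, l_\kappa;\ \kappa = 0, \dots, k;\ \nu = 1, \dots, n, \]

are integers in $K$. Suppose that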




For each $\nu$, if $f_\nu(z)$ is entire let it be of order $\mu_\nu$, and otherwise suppose
that there is an entire function $G_\nu(z)$ of order $\mu_\nu$ such that $G_\nu(z)f_\nu(z)$ is
entire and also of order $\mu_\nu$. Suppose that


and put 




Suppose finally that 


lim sup 


m 


00 


log log max (|G.(z*)l 

0<x<ib 

log m 




< V 


V} 


V = 1, , . . , 71, (31) 


and 


lim sup 


log log max 

0<x<fc 

o<x</* 




m 


00 


log m 


< ■nv, 



Then $f_1, \dots, f_n$ are algebraically dependent over $K$.

Proof: We form a polynomial

\[ \Phi(z) = \sum_{\tau_1=0}^{t_1} \cdots \sum_{\tau_n=0}^{t_n} C_{\tau_1 \dots \tau_n}\, f_1^{\tau_1}(z) \cdots f_n^{\tau_n}(z) \]

and seek to determine the coefficients $C_{\tau_1 \dots \tau_n}$ so that $\Phi$ has a zero of
order $l_\kappa + 1$ at $z_\kappa$ for $\kappa = 0, \dots, k$. Here the numbers $k$, $z_\kappa$, and
$l_\kappa$ are all defined in terms of the sequence $\zeta_1, \zeta_2, \dots$ and an index $m$, as
explained earlier; $m$ is fixed, and will be specified more exactly later.
The conditions imposed on $\Phi$ require that all the numbers

\[ \Phi^{(\lambda)}(z_\kappa), \qquad \lambda = 0, \dots, l_\kappa;\ \kappa = 0, \dots, k, \]

shall vanish, and this in turn yields a set of $m$ homogeneous linear
equations of the form

\[ \sum_{\tau} \omega_{\mu, \tau_1 \dots \tau_n} C_{\tau_1 \dots \tau_n} = 0, \qquad \mu = 1, \dots, m, \tag{33} \]

in the $t = (t_1 + 1) \cdots (t_n + 1)$ unknowns $C_{\tau_1 \dots \tau_n}$. (Of course, the
numbers $\omega_{\mu, \tau_1 \dots \tau_n}$ also depend on $t_1, \dots, t_n$.) We put

t, = [(2m^+’"+ " = 1> • ■ • - (34) 


so that

\[ t = (t_1 + 1) \cdots (t_n + 1) \ge 2m. \tag{35} \]

The coefficients in equations (33) are by assumption numbers in
$K$, and after multiplication by the rational integral factor

\[ \prod_{\nu=1}^{n} (H_\nu(z_\kappa))^{t_\nu} \]

they become integers, say $\Omega_{\mu\tau}$, of $K$. The size of the coefficients $\Omega_{\mu\tau}$ is
determined in part by this numerical factor, in part by the values of
the $f_\nu$ and their derivatives at the various points $z_\kappa$, and in part by
the numerical coefficients introduced by differentiation. In the
estimate (36) below, the second of these is accounted for by (32). The
third depends only on the set of exponents $\tau_1, \dots, \tau_n$ and the order
of the derivative considered, and so can be computed from the fact
that the sum of all the coefficients in the expansion of





by the product formula is 



— X + 
We thus obtain the bound 

r^Tl < n {H.iz,))'" • n exp ■ ( Z u.) - (3(')) 

V = 1 l/ = l 

where $\varepsilon_m > 0$ and $\varepsilon_m \to 0$ as $m \to \infty$. (Hereafter we designate any
quantity with the latter properties by $\varepsilon$, and any positive integer
independent of $m$, $\mu$, and $\tau$ by $\gamma$.)

It follows from the inequality (30) and the definition of $\eta_\nu$ that

\[ \eta_1 + \dots + \eta_n < n - 1, \tag{37} \]


so that, by (34), 




Using this, together with (32) and (36), we obtain 


n 


Tg < (7m)'expj2 Z 


By (29), < y^. By (37), vi + - ■ + Vn = n — I - 6 with 5 a 

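positive constant; hence

\[ \frac{1}{n}(1 + \eta_1 + \dots + \eta_n) + \varepsilon = \frac{n - \delta}{n} + \varepsilon = 1 - \frac{\delta}{n} + \varepsilon. \]

We henceforth require $m$ to be so large that

\[ \varepsilon < \frac{\delta}{n}. \tag{38} \]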
n 



Then 




Using this and (35), we shall now show that there are coefficients
$C_{\tau_1 \dots \tau_n}$ satisfying (33) which are integers in $K$, are not all zero, and are
such that

\[ \overline{|C_{\tau_1 \dots \tau_n}|} < \gamma^m. \tag{40} \]

To simplify the notation, arrange the $C_{\tau_1 \dots \tau_n}$ in some fixed linear
order, and rewrite (33) in the form

\[ \sum_{\tau=1}^{t} \Omega_{\mu\tau} C_\tau = 0, \qquad \mu = 1, \dots, m. \]

Let $\rho_1, \dots, \rho_g$ be an integral basis for $K$, let $h \in$ Z be positive, and
let $B$ be the set of integers in $K$ of the form

\[ b_1\rho_1 + \dots + b_g\rho_g, \]

where the $b$'s range independently over the rational integers such that
$|b| \le h$. For each set of elements $X_1, \dots, X_t$ of $B$, put

\[ y_\mu = \sum_{\tau=1}^{t} \Omega_{\mu\tau} X_\tau, \qquad \mu = 1, \dots, m. \]

This defines $(2h + 1)^{gt}$ $m$-tuples $y_1, \dots, y_m$, not necessarily different
from one another. Also, since

\[ \overline{|X_\tau|} \le \gamma h, \tag{41} \]

we have from (39) that

\[ \overline{|y_\mu|} \le \gamma^m h. \tag{42} \]

Each number $y_\mu$ has a basis representation $c_1\rho_1 + \dots + c_g\rho_g$;
similar representations, with the same $c_i$ and with the $\rho_i$ replaced by
their conjugates, hold for the conjugates of $y_\mu$. The determinant
formed from the $\rho_i$ and their conjugates is not zero, so that it is
possible to solve the $g$ equations defining $y_\mu$ and its conjugates for the
numbers $c_i$, giving each $c_i$ as a linear expression in the conjugates of
$y_\mu$, with coefficients depending only on $K$. From (42) it follows that

\[ |c_i| \le \gamma^m h. \tag{43} \]

There are, however, exactly $(2\gamma^m h + 1)^g$ different integers of $K$
whose basis representation satisfies (43); therefore there are at most
$(2\gamma^m h + 1)^{gm}$ different systems $y_1, \dots, y_m$. If

\[ (2\gamma^m h + 1)^{gm} < (2h + 1)^{gt}, \tag{44} \]

then two systems $y_1, \dots, y_m$ corresponding to two different sets
$X_1, \dots, X_t$ coincide, and the respective differences $X_1 - X_1', \dots,$
$X_t - X_t'$ constitute a solution of (33). These differences, which we
call $C_1, \dots, C_t$, are not all zero, and by (41), they satisfy the
condition

\[ \overline{|C_\tau|} \le \gamma h. \]

By (35), $t \ge 2m$, so (44) holds if

\[ (2\gamma^m h + 1)^m < (2h + 1)^{2m}, \]

which is clearly true if $h = \gamma^m$. But then (40) holds.
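The box-principle argument above can be imitated on a toy system over the rational integers (the case $g = 1$). The sketch below is not the construction of the text but a simplified variant under stated assumptions: the trial vectors have coordinates in $0, \dots, h$ rather than $-h, \dots, h$, so that the difference of two colliding vectors has coordinates of absolute value at most $h$, and the function name `small_solution` is hypothetical. A solution is guaranteed only when there are more trial vectors than possible images.

```python
from itertools import product

def small_solution(rows, h):
    """Search for a nonzero integer vector x with |x_i| <= h and rows . x = 0."""
    t = len(rows[0])
    seen = {}
    for x in product(range(h + 1), repeat=t):
        # image of x under the linear forms given by the rows
        image = tuple(sum(a * xi for a, xi in zip(row, x)) for row in rows)
        if image in seen:
            other = seen[image]
            # two distinct trial vectors with the same image: their difference is a solution
            return tuple(xi - oi for xi, oi in zip(x, other))
        seen[image] = x
    return None  # no collision among the (h + 1)^t trial vectors

if __name__ == "__main__":
    print(small_solution([[3, -5, 2, 7]], 1))
```

With one equation and four unknowns the sixteen trial vectors for $h = 1$ already produce a collision, in the same way that $t \ge 2m$ forces one in the proof.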
We now designate by $m_0$ a fixed value of $m$ such that (38) holds,
and by $k_0$ and $l_\kappa(m_0)$ the corresponding values of $k$ and $l_\kappa$, and define $\Phi$
to be a fixed function corresponding to $m_0$ and having all the
properties described up to this point. We are now able to perform an
induction.

We know that $\Phi$ possesses $m_0$ zeros, if each is counted with its
proper multiplicity. It is asserted that if $m_0$ is sufficiently large, then
$\Phi(z)$ vanishes at all the points $\zeta_1, \zeta_2, \dots$. This is proved inductively
by showing that if $\Phi(z) = 0$ for $z = \zeta_1, \dots, \zeta_m$ with $m \ge m_0$, then
also $\Phi(\zeta_{m+1}) = 0$, if $m_0$ is sufficiently large. More precisely, we
assume that $\Phi$ has a zero at $z_\kappa$ ($\kappa = 0, \dots, k$) of order $l_\kappa + 1$, with

\[ \sum_{\kappa=0}^{k} (l_\kappa + 1) = m, \]

and shall deduce that $\Phi$ has a zero at $\zeta = \zeta_{m+1}$ of order $\Lambda + 1$, where

\[ \Lambda = \begin{cases} 0 & \text{if } \zeta = z_{k+1} \ne z_\kappa \text{ for } \kappa = 0, \dots, k, \\
                          l_\sigma + 1 & \text{if } \zeta = z_\sigma \text{ and } 0 \le \sigma \le k. \end{cases} \]

Here $k = k(m)$ and $l_\kappa = l_\kappa(m)$.

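Put

\[ G(z) = \prod_{\nu=1}^{n} G_\nu^{t_\nu}(z); \]

then $G(z)\Phi(z)$ is an entire function which vanishes at the same points
$z_\kappa$ as $\Phi(z)$, and to the same order, by (31). We also put

\[ Q(z) = \prod_{\substack{\kappa=0 \\ z_\kappa \ne \zeta}}^{k} (z - z_\kappa)^{l_\kappa+1}. \]

By Cauchy's theorem,

\[ \left[ \frac{d^\Lambda}{dz^\Lambda}\, \frac{G(z)\Phi(z)}{Q(z)} \right]_{z=\zeta}
   = \frac{\Lambda!}{2\pi i} \int_\Gamma \frac{G(z)\Phi(z)}{Q(z)}\, \frac{dz}{(z - \zeta)^{\Lambda+1}}. \tag{45} \]

Here $\Gamma$ is the circle

\[ |z| = R_1 = R^{\vartheta}, \qquad \vartheta > 1, \]

where $R = r(m + 1)$ if $\alpha < \infty$ (we recall that $r(m)$ was defined
earlier as the radius of the smallest circle about 0 containing
$z_0, \dots, z_k$), while if $\alpha = \infty$, $R$ is so chosen that

\[ R \ge r(m + 1), \qquad \lim_{m \to \infty} R = \infty, \qquad \lim_{m \to \infty} \frac{\log m}{\log R} = \infty. \]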
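Since $\Phi$ has a zero of order $\Lambda$ at $\zeta$, the left side of (45) is simply

\[ \left[ \frac{G(z)\Phi^{(\Lambda)}(z)}{Q(z)} \right]_{z=\zeta}, \]

so that

\[ \Phi^{(\Lambda)}(\zeta) = \frac{Q(\zeta)}{G(\zeta)}\, \frac{\Lambda!}{2\pi i} \int_\Gamma \frac{G(z)\Phi(z)}{Q(z)}\, \frac{dz}{(z - \zeta)^{\Lambda+1}}. \tag{46} \]

We shall use this representation to estimate $\Phi^{(\Lambda)}(\zeta)$. By the
inequality (40), we have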


c 


where $\varepsilon \to 0$ as $m_0 \to \infty$, or equivalently as $m \to \infty$. By the
definition (28) of $\alpha$,


or 


Rx < 


even for a = Hence 


max \G{z)^{z)\ < t”*** exp 





since it is easily seen from the definition (34) that t < y 
(34) we also have that 

p=i 

n 


’”0 From 


V=1 

n 


E 2m' 

K=i 




We may suppose, with no loss in generality, that each $\eta_\kappa < 1$.
For suppose that $\eta_n$, say, is not less than 1. Then in analogy with (30),

\[ \frac{\mu_1 + \dots + \mu_{n-1}}{n - 2} < \alpha, \]

and all hypotheses of the theorem are satisfied by the $n - 1$ functions
$f_1(z), \dots, f_{n-1}(z)$. But if $f_1(z), \dots, f_{n-1}(z)$ can be shown to be
algebraically dependent, then $f_1(z), \dots, f_n(z)$ must also be dependent.
Consequently, if we put $\vartheta = 1 + \delta/2n$, we have

\[ (\vartheta - 1) \max_{1 \le \kappa \le n} (\eta_\kappa) < \frac{\delta}{2n}. \]

If $m_0$, and hence also $m \ge m_0$, is so large that

(47) 


7i 


then 


Yi "+‘'’-1)”-+' < nm, 


= 1 


and 


max \G( 2 )^{z)\ < < 7 " 

\z\=Ri 


(48) 


« f - - A 

Continuing the estimation of the right side of (46), we notice that
since $\eta > 1$, and since $R$ grows indefinitely with $m$, it is possible to
choose $m_0$ so large that

min \z — z^\ > ^ > (49) 

M=/?i 2 


for X = 0, . . . , fc, and also 


• 1 M 

mm I 2 - r| > — 

\z\=Ri ^ 


(50) 




In that case, min \Q{z){z — 

\z\=Ri \2/ 

Since f and all the z, lie in the disk \z\ < /?, we have 

ir - 2x1 < 2R, 

|Q(f)l < (2/er. 

Finally, we see from (31) that 


(51) 


so that 


(52) 


|G(r)l > exp (- Y t 

\ *'*1 


,m 




> 7 


— m 


\ — 1 / 

Combining the relations (48), (51), (52), and (53), we have 

/ r> \ — m — 1 


< i2RT-y^A^- 


y 


m 



Ri 


m * A 


< 7"^A 





( 53 ) 
Since by hypothesis

I = 

the inequality 


max A) < 
o<*<* 


m + 1 
log (m + 1) 


< 7”* 


holds ; hence 





Recalling that $\Phi(z)$ is a polynomial in $f_1(z), \ldots, f_n(z)$ with
coefficients $C_t$ which are integers in $K$, and that all the derivatives of
$f_1(z), \ldots, f_n(z)$ up to order $\lambda$ have values in $K$ for $z = \zeta$, we see
that $\Phi^{(\lambda)}(\zeta)$ also is a number in $K$, and that the product

*'=1 


is an integer in K, By the same reasoning as was used in producing 
the estimate (36), we have 


1^=-! 



n 


n Hj'a) ■ 

v = i 



• n exp ■ n (<. + 1)- 

v =1 »' 

The factor y”^^ comes from the estimate (40) for 1^1, while the last 
product is the total number of terms in $\Phi(z)$ itself, which was an
unnecessary factor in (36), where we were estimating the terms in a
derivative arising from a single term in $\Phi(z)$. By arguments used
previously, it follows from the last inequality that

n i/,''(r) l4><^>(f)l < 7”. 


Combining this with (54), we have 








and the upper bound here is smaller than 1 for $m$ sufficiently large,
say $m > m_1$. Hence if $m_0$ is so large that $m_0 > m_1$, and the inequalities
(47), (49), and (50) hold, then it follows from (55) that

$$\Phi^{(\lambda)}(\zeta) = 0,$$

as asserted.
To complete the proof of Theorem 5-9, we shall make use of the
following general considerations. Let $\varphi(z)$ be analytic in the disk
bounded by a circle $\Gamma$, and let $x_1, \ldots, x_p$ be interior points of this
disk. Then $\varphi(z)$ has an expansion

$$\varphi(z) = a_0 + a_1(z - x_1) + a_2(z - x_1)(z - x_2) + \cdots + a_{p-1}(z - x_1) \cdots (z - x_{p-1}) + (z - x_1) \cdots (z - x_p) R_p(z) \qquad (56)$$

with constants $a_0, \ldots, a_{p-1}$ and a function $R_p(z)$ regular in the disk.
In fact, if we put

$$\frac{1}{2\pi i} \int_\Gamma \frac{\varphi(t)}{(t - x_1) \cdots (t - x_{q+1})}\, dt = a_q \qquad (57)$$

for $q = 0, \ldots, p - 1$, and

$$\frac{1}{2\pi i} \int_\Gamma \frac{\varphi(t)}{(t - z)(t - x_1) \cdots (t - x_p)}\, dt = R_p(z), \qquad (58)$$

then

$$\varphi(z) - \bigl(a_0 + a_1(z - x_1) + \cdots + a_{p-1}(z - x_1) \cdots (z - x_{p-1})\bigr)$$
$$= \frac{1}{2\pi i} \int_\Gamma \left( \frac{1}{t - z} - \frac{1}{t - x_1} - \frac{z - x_1}{(t - x_1)(t - x_2)} - \cdots - \frac{(z - x_1) \cdots (z - x_{p-1})}{(t - x_1) \cdots (t - x_p)} \right) \varphi(t)\, dt$$
$$= \frac{1}{2\pi i} \int_\Gamma \frac{(z - x_1) \cdots (z - x_p)}{(t - z)(t - x_1) \cdots (t - x_p)}\, \varphi(t)\, dt$$
$$= (z - x_1) \cdots (z - x_p) R_p(z),$$

and it is clear from (58) that $R_p(z)$ is regular inside $\Gamma$.
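The expansion (56) can be checked on a small example. The sketch below rests on one assumption that is not stated in the text: for distinct points, the contour integrals (57) equal the divided differences of $\varphi$ at $x_1, \ldots, x_{q+1}$ (a standard consequence of the residue theorem). The function names are illustrative only. For a polynomial $\varphi$ the remainder $R_p(z)$ can then be recovered exactly by division.

```python
from fractions import Fraction

def divided_differences(xs, f):
    """Return [a_0, ..., a_{p-1}], the divided differences of f at the distinct points xs."""
    table = [Fraction(f(x)) for x in xs]
    coeffs = [table[0]]
    for order in range(1, len(xs)):
        table = [(table[i + 1] - table[i]) / (xs[i + order] - xs[i])
                 for i in range(len(table) - 1)]
        coeffs.append(table[0])
    return coeffs

def newton_parts(xs, coeffs, z):
    """Return (a_0 + a_1(z-x_1) + ... + a_{p-1}(z-x_1)...(z-x_{p-1}), (z-x_1)...(z-x_p))."""
    total, product = Fraction(0), Fraction(1)
    for a, x in zip(coeffs, xs):
        total += a * product
        product *= (z - x)
    return total, product

def remainder(xs, f, z):
    """R_p(z) from (56), for z different from every interpolation point."""
    total, product = newton_parts(xs, divided_differences(xs, f), z)
    return (f(z) - total) / product

xs = [Fraction(0), Fraction(1), Fraction(2)]
cube = lambda z: z ** 3
vanishing = lambda z: z * (z - 1) * (z - 2)   # vanishes at every interpolation point

print(divided_differences(xs, cube))
print(divided_differences(xs, vanishing))
```

The second call illustrates the situation used in the proof: when the function vanishes at all the interpolation points, every coefficient is zero and only the remainder term of (56) survives.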

We apply this with $\varphi(z) = \Phi(z)G(z)$, the "interpolation points"
$x_1, \ldots, x_p$ being $\zeta_1, \ldots, \zeta_m$, in this order. For $\Gamma$ we choose the
circle $|z| = R_1 = R^{\eta}$ with $\eta > 1$, where $R \ge r(m)$, $\lim_{m \to \infty} R = \infty$, and

$$\liminf_{m \to \infty} \frac{\log m}{\log R} = \alpha.$$

Since $\Phi(z)$ vanishes at all the points $\zeta_1, \zeta_2, \ldots$, the integrand in the
expression (57) for $a_q$ is regular in $\Gamma$, so that

$$a_q = 0 \quad \text{for } q = 0, \ldots, m - 1.$$
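Hence for fixed $z$ with $z \ne \zeta_1, \zeta_2, \ldots$,

$$G(z)\Phi(z) = \lim_{m \to \infty} \left[ Q_m(z) \cdot \frac{1}{2\pi i} \int_\Gamma \frac{G(t)\Phi(t)}{Q_m(t)}\, \frac{dt}{t - z} \right],$$

where

$$Q_m(z) = \prod_{\kappa=0}^{k} (z - z_\kappa)^{l_\kappa + 1}, \qquad \sum_{\kappa=0}^{k} (l_\kappa + 1) = m.$$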

As in the derivation of (54), we have 


n 


n 


\G{z)^{z)\ < t”'" n (<v + 1) exp y E (2r) 


m 



— m 


< 7 


m 


/ 7 ? \”* 

(— ) = 

V«l/ 


Since this inequality holds for arbitrarily large $m$, and since $R$
increases indefinitely with $m$ while $\gamma$ does not, it must be that

$$G(z)\Phi(z) = 0.$$

Hence $\Phi(z)$ vanishes for all $z$, which is the assertion of the theorem.


5-7 The Hilbert-Gelfond-Schneider theorem. As an application
of Schneider's theorem of the preceding section, we now prove

Theorem 5-10. If $a$ and $b$ are algebraic numbers, $b$ is irrational,
and $a$ is neither 0 nor 1, then $a^b$ is transcendental.

This theorem settles a question raised by Euler concerning the
arithmetic nature of the logarithm of a rational number to a rational
base, and repeated in more general form in the seventh of Hilbert's
famous list of 23 outstanding problems which seemed to him to be
both difficult and important. The list appeared in 1900, but it was
not until 1929 that Gelfond made the first contribution to the solution
of this problem. Further partial results were obtained by Kusmin,
Siegel, and Boehle, and in 1934 complete proofs were given almost
simultaneously by Gelfond and Schneider. The
proof to be given now is most nearly in the spirit of Gelfond's
paper; it should be instructive to the reader also to examine the
original complete proofs by Gelfond and Schneider.

We apply Theorem 5-9 with $n = 2$, $f_1(z) = a^z$, $f_2(z) = z$, and
$\zeta = u + vb$, where $u$ and $v$ range over the positive integers. On
account of the irrationality of $b$, the numbers $\zeta$ are distinct; they
are to be ordered by the size of $u + v$, and otherwise arbitrarily.
Suppose that in the sequence $\zeta_1, \ldots, \zeta_m$ all the numbers occur for
which $u + v \le d$, and possibly some (but not all) of those for which
$u + v = d + 1$. Then clearly

$$\frac{d(d - 1)}{2} \le m < \frac{d(d + 1)}{2},$$

while

$$r = \max(|u + vb|) \le (d + 1)(1 + |b|),$$

and (taking $u = d - 1$, $v = 1$),

$$r \ge d - 1 - |b|.$$

These inequalities show that $\gamma_1 r^2 < m < \gamma_2 r^2$, for some positive $\gamma_1$
and $\gamma_2$. Thus

$$\alpha = \lim_{m \to \infty} \frac{\log m}{\log r} = 2.$$

By the choice of $f_1(z)$ and $f_2(z)$, $\mu_1 = 1$ and $\mu_2 = 0$, and the
inequality (30) holds. Since $a^z$ and $z$ are entire, (31) is without force.
If we suppose that $z$ and $a^z$ are elements of an algebraic number field
$K$ for $z = b$, then $f_2(\zeta) = u + vb$ and $f_1(\zeta) = a^u (a^b)^v$ are also in $K$
for positive integral $u$ and $v$. (We need not examine the derivatives,
since $\zeta_1, \zeta_2, \ldots$ are distinct.) Moreover, if $c$ is a positive rational
integer such that $ca$, $cb$, and $ca^b$ are integers of $K$, then we can choose

and 7/2 (^x) = c 

for $\zeta_\kappa = u + vb$. It follows from this and the definitions of $f_1(z)$ and
$f_2(z)$ that the inequality (32) holds. Thus, under the assumption
that $a$, $b$, and $a^b$ are all algebraic, all hypotheses of Theorem 5-9 are
satisfied, and it follows that $z$ and $a^z$ are algebraically dependent.
This being palpably false, the above assumption cannot be maintained,
and the theorem is proved.
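The counting step in this proof, $d(d-1)/2 \le m < d(d+1)/2$ together with the two bounds on $r$, can be checked numerically. The sketch below takes $b = \sqrt{2}$ purely as an illustrative irrational algebraic number and uses only the complete layers $u + v \le d$; the function name is hypothetical. The ratio $\log m / \log r$ approaches 2 only slowly, since the convergence is logarithmic.

```python
import math

def zeta_points(d, b):
    """All numbers u + v*b with positive integers u, v and u + v <= d, ordered by u + v."""
    points = []
    for total in range(2, d + 1):
        for u in range(1, total):
            points.append(u + (total - u) * b)
    return points

b = math.sqrt(2)   # illustrative choice of an irrational algebraic b
for d in (10, 100, 1000):
    points = zeta_points(d, b)
    m = len(points)
    r = max(abs(p) for p in points)
    # m grows like r^2, so this ratio drifts toward alpha = 2
    print(d, m, round(r, 3), round(math.log(m) / math.log(r), 4))
```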

PROBLEM 

Show that $e^{\eta}$ is transcendental for algebraic $\eta \ne 0$. [Hint: Choose
$f_1(z) = e^z$, $f_2(z) = z$, and

for n = 1, 2, . . ..] 
In this chapter and the next we shall consider various questions
concerning the distribution of the rational primes. This is a large
and difficult field, and we shall be able to obtain only a few of the
important results. The first of them, to which this chapter is devoted,
is Dirichlet's famous theorem that there are infinitely many
primes of the form $km + l$, where $k$ and $l$ are fixed integers which are
relatively prime.

6-1 Introduction. Although proofs of certain special cases of
Dirichlet's theorem are given in elementary texts,* the methods used
cannot be generalized to prove the full theorem. To get an idea of
the method used by Dirichlet, let us consider the question of the
infinitude of the set of primes of the form $4k + 1$. We base the discussion
on the Riemann $\zeta$-function, defined for $s > 1$ by the equation

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$

This is perhaps the simplest of all the Dirichlet series

$$\sum_{n=1}^{\infty} \frac{a_n}{n^s},$$

which play an important role in prime number theory. One reason
for their importance is exhibited in the following theorem, which
gives a relation between the set of primes and the set of positive
integers.

Theorem 6-1. For $s > 1$,

$$\zeta(s) = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1}. \qquad (1)$$

Proof: In less abbreviated form, the assertion is that

$$\lim_{N \to \infty} \prod_{p \le N} \left(1 - \frac{1}{p^s}\right)^{-1} = \lim_{N \to \infty} \sum_{n=1}^{N} \frac{1}{n^s}.$$

* See, for example, Volume I, pp. 9, 46, 59. 
The relation

$$\frac{1}{1 - x} = 1 + x + x^2 + \cdots = \sum_{n=0}^{\infty} x^n$$

holds for $|x| < 1$; since $|p^{-s}| < 1$, we have

$$\prod_{p \le N} (1 - p^{-s})^{-1} = \prod_{p \le N} (1 + p^{-s} + p^{-2s} + \cdots).$$

Multiplying out the product on the right, we obtain terms of the
form $n^{-s}$, where $n$ runs over the integers composed exclusively of
primes not exceeding $N$. Moreover, each such $n$ occurs exactly once,
by the Unique Factorization Theorem. The multiplication is permissible,
since the series involved are absolutely convergent, and
the terms can be arranged in any order. Thus

$$\prod_{p \le N} (1 - p^{-s})^{-1} = {\sum}' n^{-s},$$

where the accent indicates a summation, in the natural order, over
all $n$ such that $p \mid n$ implies $p \le N$. In particular, the sum contains
all terms $n^{-s}$ for which $n \le N$. Hence

$$\prod_{p \le N} (1 - p^{-s})^{-1} = \sum_{n=1}^{N} n^{-s} + {\sum_{n > N}}' n^{-s},$$

and

$$0 < {\sum_{n > N}}' n^{-s} < \sum_{n=N+1}^{\infty} n^{-s} < \int_N^{\infty} \frac{dx}{x^s} = \frac{1}{(s - 1) N^{s-1}},$$

since $s > 1$. Thus

$${\sum_{n > N}}' n^{-s} = o(1)$$

as $N \to \infty$, and

$$\lim_{N \to \infty} \prod_{p \le N} (1 - p^{-s})^{-1} = \lim_{N \to \infty} \sum_{n=1}^{N} n^{-s} = \zeta(s).$$

To see exactly how $\zeta(s)$ behaves as $s \to 1^+$, we use the following
standard result.*


Lemma. Suppose that $\lambda_1, \lambda_2, \ldots$ is a nondecreasing sequence tending
to infinity, that $c_1, c_2, \ldots$ is an arbitrary sequence of real or
complex numbers, and that $f(x)$ has a continuous derivative for
$x \ge \lambda_1$. Put

$$C(x) = \sum_{\lambda_n \le x} c_n.$$

* See, for example, Volume I, Theorem 6-15. 
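Then for $x \ge \lambda_1$,

$$\sum_{\lambda_n \le x} c_n f(\lambda_n) = C(x) f(x) - \int_{\lambda_1}^{x} C(t) f'(t)\, dt.$$

Applying this with $\lambda_n = n$, $c_n = 1$, $f(x) = x^{-s}$, we obtain

$$\sum_{n \le x} \frac{1}{n^s} = \frac{[x]}{x^s} + s \int_1^x \frac{[t]}{t^{s+1}}\, dt$$

for $x \ge 1$. If we put $(x) = x - [x]$, we have for $s > 1$,

$$\sum_{n \le x} \frac{1}{n^s} = \frac{1}{x^{s-1}} - \frac{(x)}{x^s} + \frac{s}{s - 1}\left(1 - \frac{1}{x^{s-1}}\right) - s \int_1^x \frac{(t)}{t^{s+1}}\, dt.$$

Letting $x$ increase without bound and noting that $0 \le (x) < 1$,
we have

$$\zeta(s) = \frac{s}{s - 1} - s \int_1^{\infty} \frac{(t)}{t^{s+1}}\, dt. \qquad (2)$$

This expression for $\zeta(s)$ agrees with the earlier definition for $s > 1$,
but it is also meaningful for $0 < s < 1$, since the integral converges
for all $s > 0$. It may therefore be thought of as defining $\zeta(s)$ for
$s > 0$, $s \ne 1$. At any rate, (2) shows that

$$\lim_{s \to 1^+} \zeta(s)(s - 1) = 1, \qquad (3)$$

and a fortiori that

$$\lim_{s \to 1^+} \zeta(s) = \infty. \qquad (4)$$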
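The partial-summation lemma in the special case $\lambda_n = n$, $c_n = 1$, $f(x) = x^{-s}$ can be verified numerically. In the sketch below (function names are illustrative), the integral $s \int_1^N [t]\, t^{-s-1}\, dt$ is evaluated in closed form on each interval $[n, n+1)$, where $[t] = n$, so no numerical quadrature is involved. The identity holds for every $s > 0$, not only for $s > 1$.

```python
def direct_sum(N, s):
    """Left side: the sum of n^{-s} for 1 <= n <= N."""
    return sum(n ** (-s) for n in range(1, N + 1))

def by_partial_summation(N, s):
    """Right side: C(N) f(N) - integral of C(t) f'(t) over [1, N], with C(t) = [t], f(t) = t^{-s}."""
    boundary = N * N ** (-s)
    # On [n, n+1) we have [t] = n, and s * integral of n * t^{-s-1} is n * (n^{-s} - (n+1)^{-s}).
    integral = sum(n * (n ** (-s) - (n + 1) ** (-s)) for n in range(1, N))
    return boundary + integral

for s in (0.5, 1.5, 2.0):
    print(s, direct_sum(500, s), by_partial_summation(500, s))
```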



For the remainder of this section, let $q$ and $r$ designate primes of
the forms $4k + 1$ and $4k - 1$ respectively. Define the function
$\chi(n)$ by the equations

$$\chi(1) = 1, \quad \chi(q) = 1, \quad \chi(r) = -1, \quad \chi(2) = 0,$$
$$\chi(mn) = \chi(m)\chi(n) \quad \text{for every pair of integers } m, n.$$

(A function which satisfies the last of these conditions is said to be
completely multiplicative; it is entirely determined when its values for
all prime arguments are known, since $\chi(p^{\alpha}) = (\chi(p))^{\alpha}$.) Inasmuch
as $n \equiv 1 \pmod 4$ if and only if $2 \nmid n$ and the total number of $r$'s
dividing $n$ is even, we have

$$\chi(n) = \begin{cases} 0 & \text{if } 2 \mid n, \\ (-1)^{(n-1)/2} & \text{if } 2 \nmid n. \end{cases}$$
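The agreement between the two descriptions of $\chi$ can be confirmed by direct computation. The sketch below (function names are illustrative) builds $\chi(n)$ from the prime factorization of $n$ using the defining values at primes, and compares it with the closed form just derived.

```python
def chi_closed_form(n):
    """chi(n) = 0 for even n, (-1)^((n-1)/2) for odd n."""
    if n % 2 == 0:
        return 0
    return (-1) ** ((n - 1) // 2)

def chi_from_primes(n):
    """chi(n) built multiplicatively from chi(2) = 0, chi(q) = 1, chi(r) = -1."""
    value, p = 1, 2
    while n > 1:
        while n % p == 0:          # p is prime here, since smaller factors are already removed
            value *= 0 if p == 2 else (1 if p % 4 == 1 else -1)
            n //= p
        p += 1
    return value

print([chi_closed_form(n) for n in range(1, 13)])
print([chi_from_primes(n) for n in range(1, 13)])
```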


We now investigate the function

$$L(s) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}.$$

If we write $\sum a_n \ll \sum b_n$ to mean that $|a_n| \le b_n$ for $n = 1, 2, \ldots$,
then

$$\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} \ll \sum_{n=1}^{\infty} \frac{1}{n^s}$$

for $s > 1$, so that the series for $L(s)$ is absolutely convergent for
$s > 1$. More than this is true, however: the series for $L(s)$ converges
for $s > 0$. For we note that for any $n > 0$,

$$\chi(n) + \chi(n + 1) + \chi(n + 2) + \chi(n + 3) = 0,$$

so that we have

$$\sum_{n=1}^{N} \chi(n) = \sum_{n=1}^{4} \chi(n) + \sum_{n=5}^{8} \chi(n) + \cdots + \sum_{n=4[N/4]+1}^{N} \chi(n) = 0 + 0 + \cdots + 0 + \sum_{n=4[N/4]+1}^{N} \chi(n),$$

and hence

$$\left| \sum_{n=1}^{N} \chi(n) \right| \le 1.$$

The truth of the assertion is therefore a weak consequence of the
following theorem, which is due to Abel.

Theorem 6-2. If $\{a_n\}$ is a sequence of constants for which

$$\sum_{n=1}^{N} a_n = O(1)$$

as $N \to \infty$, and if $\{b_n(s)\}$ is a sequence of positive-valued functions
which converges monotonically and uniformly to zero for $s$ in some
interval $J$, then the series

$$\sum_{n=1}^{\infty} a_n b_n(s)$$

converges uniformly for $s$ in $J$.

Proof: Put

$$A_n = \sum_{\nu=1}^{n} a_\nu,$$

so that $|A_n| \le A$ for some $A$ and all $n$. Using the monotonicity of
$b_n(s)$, we have

$$\left| \sum_{n=j}^{k} a_n b_n(s) \right| = \left| \sum_{n=j}^{k} (A_n - A_{n-1}) b_n(s) \right|$$
$$= \left| \sum_{n=j}^{k-1} A_n \bigl(b_n(s) - b_{n+1}(s)\bigr) + A_k b_k(s) - A_{j-1} b_j(s) \right|$$
$$\le A\bigl(b_j(s) - b_k(s)\bigr) + A b_k(s) + A b_j(s) = 2A b_j(s).$$

By hypothesis, this upper bound can be made uniformly small, for
$s$ in $J$, by taking $j$ sufficiently large. This proves the theorem.
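The bound $2A\, b_j(s)$ obtained in this proof can be observed numerically. The sketch below takes $a_n = \chi(n)$, the function of Section 6-1 (so that $A = 1$), and $b_n(s) = n^{-s}$; the function name `worst_ratio` is hypothetical. It reports the largest value of $|\sum_{n=j}^{k} a_n b_n(s)|$ divided by $2 b_j(s)$ over a range of $k$, which the theorem says cannot exceed 1.

```python
def chi(n):
    """The function chi of Section 6-1: 0 for even n, (-1)^((n-1)/2) for odd n."""
    return 0 if n % 2 == 0 else (-1) ** ((n - 1) // 2)

def worst_ratio(s, j, k_max):
    """Largest |sum_{n=j}^{k} chi(n) n^{-s}| / (2 j^{-s}) over j <= k <= k_max."""
    running, worst = 0.0, 0.0
    for n in range(j, k_max + 1):
        running += chi(n) * n ** (-s)
        worst = max(worst, abs(running) / (2 * j ** (-s)))
    return worst

for s in (0.1, 0.5, 1.0):
    print(s, [round(worst_ratio(s, j, 3000), 4) for j in (1, 10, 101)])
```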

Here we have a situation which does not arise in the case of power
series. For while a power series converges absolutely at every interior
point of its interval of convergence, the Dirichlet series for $L(s)$
converges for $s > 0$, but converges absolutely only for $s > 1$, since
the series

$$\sum_{n=1}^{\infty} \frac{|\chi(n)|}{n} = \sum_{n=0}^{\infty} \frac{1}{2n + 1}$$

diverges.

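On account of the complete multiplicativity of $\chi$, we have

$$\left(1 - \frac{\chi(p)}{p^s}\right)^{-1} = 1 + \frac{\chi(p)}{p^s} + \frac{(\chi(p))^2}{p^{2s}} + \cdots = 1 + \frac{\chi(p)}{p^s} + \frac{\chi(p^2)}{p^{2s}} + \cdots.$$

Using this idea, the proof of Theorem 6-1 can easily be modified to
yield

Theorem 6-3. If $f$ is completely multiplicative, and the series

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$$

converges absolutely for $s > s_0$, then

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \prod_p \left(1 - \frac{f(p)}{p^s}\right)^{-1}$$

for $s > s_0$.

Corollary. For $s > 1$,

$$L(s) = \prod_p \left(1 - \frac{\chi(p)}{p^s}\right)^{-1}.$$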
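Theorem 6-1 and the corollary to Theorem 6-3 can be compared numerically at $s = 2$. The sketch below truncates both the series and the products at $10^5$; the cutoff and the helper names are illustrative choices, and the two sides agree only up to the truncation error.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def chi(n):
    return 0 if n % 2 == 0 else (-1) ** ((n - 1) // 2)

def dirichlet_sum(f, s, terms):
    """Partial sum of f(n) / n^s for 1 <= n <= terms."""
    return sum(f(n) / n ** s for n in range(1, terms + 1))

def euler_product(f, s, bound):
    """Product of (1 - f(p) / p^s)^(-1) over primes p <= bound."""
    value = 1.0
    for p in primes_up_to(bound):
        value /= 1.0 - f(p) / p ** s
    return value

one = lambda n: 1
series_zeta, product_zeta = dirichlet_sum(one, 2, 10**5), euler_product(one, 2, 10**5)
series_L, product_L = dirichlet_sum(chi, 2, 10**5), euler_product(chi, 2, 10**5)
print(series_zeta, product_zeta, math.pi ** 2 / 6)
print(series_L, product_L)
```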




We are finally in a position to prove Dirichlet's theorem for primes
of the form $4k + 1$. Let $s$ be greater than 1. We have

$$\zeta(s) = \prod_p (1 - p^{-s})^{-1} = (1 - 2^{-s})^{-1} \prod_q (1 - q^{-s})^{-1} \prod_r (1 - r^{-s})^{-1}$$

and, from the corollary to Theorem 6-3,

$$L(s) = \prod_q (1 - q^{-s})^{-1} \prod_r (1 + r^{-s})^{-1}.$$

Hence

$$\zeta(s)L(s) = (1 - 2^{-s})^{-1} \prod_q (1 - q^{-s})^{-2} \prod_r (1 - r^{-2s})^{-1}. \qquad (5)$$

Now, for $s > 1$,

$$L(s) = 1 - \frac{1}{3^s} + \frac{1}{5^s} - \frac{1}{7^s} + \cdots > 1 - \frac{1}{3} = \frac{2}{3},$$

and so

$$\lim_{s \to 1^+} \zeta(s)L(s) = \infty$$

by (4). If there were only finitely many primes $q$, the expression
on the right side of (5) would remain bounded as $s \to 1^+$, since
for $s > 1$,

$$\prod_r (1 - r^{-2s})^{-1} < \prod_r (1 - r^{-2})^{-1} < \prod_p (1 - p^{-2})^{-1} = \zeta(2).$$

This contradiction shows not only that there are infinitely many
primes $q$, but also that they occur sufficiently frequently that

$$\lim_{s \to 1^+} \prod_q (1 - q^{-s})^{-1} = \infty.$$

The proof which has just been given contains most of the essential
features of the general proof. The major formal difference which will
arise in the general case is that we shall have to consider a number
of functions like $\chi$ above, and each will have an associated Dirichlet
series, some aspects of whose behavior must be investigated. The
most difficult part of the proof lies in showing that these series do
not vanish at $s = 1$, a point which caused no trouble in the present
case.


PROBLEM 

Let $\delta_n = 1$ or 0, according as the equation $n = x^2 + y^2$ has or does not
have a solution in integers $x$, $y$. It is known* that $\delta_n = 1$ if and only if
every prime $r \equiv -1 \pmod 4$ which divides $n$ occurs to an even power in
the canonical factorization of $n$. Show that the series

$$\sum_{n=1}^{\infty} \frac{\delta_n}{n^s}$$

converges for $s > 1$, and diverges for $s \le 1$. [Hint: Establish a relation
among $\zeta(s)$, $L(s)$, and the square of the given series.]

6-2 Characters. We recall that the elements of a reduced residue
system (mod $k$) form an abelian group under multiplication (mod $k$),
which we designate by $M(k)$. The number of elements of $M(k)$,
called its order, is $\varphi(k)$; hereafter we shall use $h$ as an abbreviation for
$\varphi(k)$.

One of the fundamental theorems on finite abelian groups is that
every such group has a basis: if it is a multiplicative group, this
means that there is a set of elements $A_1, \ldots, A_r$ such that every
element of the group can be written uniquely in the form

$$A_1^{x_1} \cdots A_r^{x_r},$$

where each $x_i$ is one of the integers $0, 1, \ldots, \operatorname{ord} A_i - 1$, and $\operatorname{ord} A_i$ is
the order of the cyclic subgroup generated by $A_i$. Moreover, the
product of all the numbers $\operatorname{ord} A_i$ is the order of the group. The
following theorem, for which we give a proof based on the theory of
primitive roots,† is a special case.

Theorem 6-4. (a) Let $k = p_1^{\alpha_1} \cdots p_r^{\alpha_r}$, where $p_i \ne p_j$ for $i \ne j$ and each
of the prime powers $p_i^{\alpha_i}$ has a primitive root, say $g_i$. Then the numbers


* See, for example, Volume I, Theorem 7-3.
† See, for example, Volume I, Chapter 4.
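$A_1, \ldots, A_r$ form a basis for $M(k)$ if, for each $i$,

$$A_i \equiv \begin{cases} g_i \pmod{p_i^{\alpha_i}}, \\ 1 \pmod{p_j^{\alpha_j}} & \text{if } j \ne i,\ 1 \le j \le r. \end{cases}$$

(b) Let $k = 2^{\alpha} p_2^{\alpha_2} \cdots p_r^{\alpha_r}$, where $\alpha \ge 3$, and let $g_i$ be a primitive
root of $p_i^{\alpha_i}$ for $2 \le i \le r$. Then the numbers $A_0, A_1, \ldots, A_r$ constitute
a basis for $M(k)$, where

$$A_0 \equiv \begin{cases} -1 \pmod{2^{\alpha}}, \\ 1 \pmod{p_i^{\alpha_i}}, & 2 \le i \le r, \end{cases} \qquad A_1 \equiv \begin{cases} 5 \pmod{2^{\alpha}}, \\ 1 \pmod{p_i^{\alpha_i}}, & 2 \le i \le r, \end{cases}$$

and for $2 \le i \le r$ (with $p_1^{\alpha_1} = 2^{\alpha}$),

$$A_i \equiv \begin{cases} g_i \pmod{p_i^{\alpha_i}}, \\ 1 \pmod{p_j^{\alpha_j}} & \text{for } j \ne i,\ 1 \le j \le r. \end{cases}$$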
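Part (a) of the theorem can be tried out on a small modulus. The sketch below uses $k = 45 = 3^2 \cdot 5$ and the primitive root $g = 2$ of both 9 and 5; these values and the helper names are illustrative choices, not taken from the text. The basis elements are obtained from the Chinese Remainder Theorem, and the dictionary `products` lists $A_1^{x_1} A_2^{x_2}$ for all admissible exponents.

```python
def crt_pair(r1, m1, r2, m2):
    """Smallest x >= 0 with x = r1 (mod m1) and x = r2 (mod m2), for coprime m1, m2."""
    t = ((r2 - r1) * pow(m1, -1, m2)) % m2   # modular inverse via pow needs Python 3.8+
    return (r1 + m1 * t) % (m1 * m2)

def order(a, k):
    """Multiplicative order of the unit a modulo k."""
    n, x = 1, a % k
    while x != 1:
        x = x * a % k
        n += 1
    return n

k = 45                       # 3^2 * 5
g = 2                        # a primitive root of both 9 and 5
A1 = crt_pair(g, 9, 1, 5)    # A1 = 2 (mod 9) and A1 = 1 (mod 5)
A2 = crt_pair(1, 9, g, 5)    # A2 = 1 (mod 9) and A2 = 2 (mod 5)

# Every product A1^x1 * A2^x2 with 0 <= x_i < ord A_i
products = {(x1, x2): pow(A1, x1, k) * pow(A2, x2, k) % k
            for x1 in range(order(A1, k)) for x2 in range(order(A2, k))}
print(A1, A2, order(A1, k), order(A2, k), len(set(products.values())))
```

If the theorem applies, the values in `products` should run through each element of $M(45)$ exactly once.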


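Proof: Let $a$ be relatively prime to $k$. Then it is also prime to every
divisor of $k$, so that there are unique elements $a_1, \ldots, a_r$ of
$M(p_1^{\alpha_1}), \ldots, M(p_r^{\alpha_r})$, respectively, such that

$$a \equiv a_1 \pmod{p_1^{\alpha_1}}, \quad \ldots, \quad a \equiv a_r \pmod{p_r^{\alpha_r}}. \qquad (6)$$

Conversely, for any choice of $a_1, \ldots, a_r$ in $M(p_1^{\alpha_1}), \ldots, M(p_r^{\alpha_r})$
respectively, the system (6) has a solution $a$ which is unique modulo $k$,
by the Chinese Remainder Theorem, and $a$ is prime to $k$. Moreover,
if $a$ is the solution of (6), and if, for $1 \le i \le r$, $b_i$ is the solution of the
system

$$b_i \equiv \begin{cases} a_i \pmod{p_i^{\alpha_i}}, \\ 1 \pmod{p_j^{\alpha_j}} & \text{for } j \ne i,\ 1 \le j \le r, \end{cases} \qquad (7)$$

then

$$b_1 \cdots b_r \equiv 1 \cdots 1 \cdot a_i \cdot 1 \cdots 1 \equiv a_i \pmod{p_i^{\alpha_i}} \quad \text{for } 1 \le i \le r,$$

so that

$$a \equiv b_1 \cdots b_r \pmod k. \qquad (8)$$

(Thus, in the language of group theory, $M(k)$ is the direct product
of $M(p_1^{\alpha_1}), \ldots, M(p_r^{\alpha_r})$.)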
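Now if $p_i^{\alpha_i}$ has a primitive root $g_i$, and

$$A_i \equiv \begin{cases} g_i \pmod{p_i^{\alpha_i}}, \\ 1 \pmod{p_j^{\alpha_j}} & \text{for } j \ne i,\ 1 \le j \le r, \end{cases}$$

then, since

$$a_i \equiv g_i^{\operatorname{ind} a_i} \pmod{p_i^{\alpha_i}},$$

we have that

$$b_i \equiv A_i^{\operatorname{ind} a_i} \pmod{p_j^{\alpha_j}}, \quad \text{for } 1 \le j \le r,$$

and hence

$$b_i \equiv A_i^{\operatorname{ind} a_i} \pmod k.$$

Thus by the congruence (8), if all $p_i^{\alpha_i}$ have primitive roots,

$$a \equiv A_1^{\operatorname{ind} a_1} \cdots A_r^{\operatorname{ind} a_r} \pmod k,$$

and this representation is unique if each index is given its smallest
non-negative value, so that $0 \le \operatorname{ind} a_i < \varphi(p_i^{\alpha_i})$.

On the other hand, if $p_1^{\alpha_1} = 2^{\alpha}$ with $\alpha \ge 3$, then $-1$ and 5
constitute a basis for $M(p_1^{\alpha_1})$. For* 5 is a primitive $\lambda$-root of $2^{\alpha}$, so that
the numbers

$$5, 5^2, \ldots, 5^{2^{\alpha-2}}$$

are distinct (mod $2^{\alpha}$); since they are all congruent to 1 (mod 4),
and since there are exactly $2^{\alpha-2}$ numbers in a reduced residue system
(mod $2^{\alpha}$) which are congruent to 1 (mod 4), these must be the
numbers. Likewise, their negatives are all the numbers congruent
to $-1$ (mod 4) in a reduced residue system (mod $2^{\alpha}$).† Hence, if $a$
is in $M(2^{\alpha})$, then, for some choice of $x_0$ and $x_1$,

$$a \equiv (-1)^{x_0} 5^{x_1} \pmod{2^{\alpha}}.$$

Thus if $A_0, \ldots, A_r$ are defined as in part (b) of the theorem, we have

$$a \equiv A_0^{x_0} A_1^{x_1} A_2^{\operatorname{ind} a_2} \cdots A_r^{\operatorname{ind} a_r} \pmod k,$$

and the representation is again unique if we require that

$$0 \le x_0 < \operatorname{ord} A_0 = 2,$$
$$0 \le x_1 < \operatorname{ord} A_1 = 2^{\alpha-2},$$
$$0 \le \operatorname{ind} a_i < \operatorname{ord} A_i = \varphi(p_i^{\alpha_i}).$$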
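The claim that $-1$ and 5 form a basis for $M(2^{\alpha})$ when $\alpha \ge 3$ is easy to test by machine. The sketch below (the function name is illustrative) forms all residues $(-1)^{x_0} 5^{x_1}$ with $0 \le x_0 < 2$ and $0 \le x_1 < 2^{\alpha-2}$ and compares them with the odd residues modulo $2^{\alpha}$.

```python
def generated(alpha):
    """Set of residues (-1)^x0 * 5^x1 modulo 2^alpha, 0 <= x0 < 2, 0 <= x1 < 2^(alpha-2)."""
    mod = 2 ** alpha
    return {((-1) ** x0 * pow(5, x1, mod)) % mod
            for x0 in range(2) for x1 in range(2 ** (alpha - 2))}

for alpha in range(3, 8):
    residues = generated(alpha)
    # The second entry is the number of units modulo 2^alpha, namely 2^(alpha-1).
    print(alpha, len(residues), 2 ** (alpha - 1))
```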

* See Volume I, Theorem 4-9. 

t A similar argument is used in Volume I, in the proof of Theorem 5-1 
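Notice that in the two cases we have

$$\operatorname{ord} A_1 \cdots \operatorname{ord} A_r = \varphi(p_1^{\alpha_1}) \cdots \varphi(p_r^{\alpha_r}) = h,$$
$$\operatorname{ord} A_0 \cdot \operatorname{ord} A_1 \cdots \operatorname{ord} A_r = 2 \cdot 2^{\alpha-2} \cdot \varphi(p_2^{\alpha_2}) \cdots \varphi(p_r^{\alpha_r}) = h.$$

To obviate the distinction between cases (a) and (b), we rename the
basis elements $B_1, \ldots, B_m$, and put $\operatorname{ord} B_i = h_i$ for $i = 1, \ldots, m$.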

A complex-valued function $\chi$, defined over the group $M(k)$ (more
generally, over any finite abelian group), is called a character (mod $k$)
(or a character of the group) if it is completely multiplicative and not
identically zero, that is, if

$$\chi(ab) = \chi(a)\chi(b), \quad \text{for } a \text{ and } b \text{ in } M(k),$$
$$\chi(a) \ne 0, \quad \text{for some } a \text{ in } M(k).$$

Since in the group $M(k)$ we identify integers which are congruent
(mod $k$), we have

$$\chi(a) = \chi(a') \quad \text{if } a \equiv a' \pmod k \text{ and } (a, k) = (a', k) = 1,$$

so that one could also think of characters as being defined over the
residue classes themselves. Notice that necessarily $\chi(1) = 1$, since
for any $a$ for which $\chi(a) \ne 0$, we have $\chi(a) = \chi(a \cdot 1) = \chi(a)\chi(1)$.
Moreover, if $a$ is in $M(k)$ and $\operatorname{ord} a = t$, then

$$(\chi(a))^t = \chi(a^t) = \chi(1) = 1.$$

Since $t \mid h$, it follows that every value of every character is an $h$th root
of unity.

On account of its complete multiplicativity, any character is totally
determined when its value is specified for each basis element $B_j$.
Thus the characters are contained in the set of all completely
multiplicative functions over $M(k)$ for which

$$\chi(B_j) = e^{2\pi i \beta_j / h_j}, \qquad 0 \le \beta_j < h_j, \qquad (9)$$

for $j = 1, \ldots, m$. But conversely, every such function is obviously
a character, and different choices of $\beta_j$ lead to different characters.
Thus there are $h$ different characters, corresponding to the $h_1 \cdots h_m$
different $m$-tuples $(\beta_1, \ldots, \beta_m)$.
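The construction of characters from a basis can be carried out explicitly for a small modulus. The sketch below uses $k = 15$ with the basis $B_1 = 11$ (order 2) and $B_2 = 7$ (order 4), which is the basis supplied by Theorem 6-4(a) for the primitive root 2 of both 3 and 5; the modulus and the names `COORDS`, `character`, `characters` are illustrative choices. Each pair $(\beta_1, \beta_2)$ gives one character through (9).

```python
import cmath
from itertools import product

K = 15
BASIS = [(11, 2), (7, 4)]        # pairs (B_j, h_j): 11 has order 2 and 7 has order 4 mod 15

# Unique coordinates (x_1, x_2) with a = 11^{x_1} * 7^{x_2} (mod 15)
COORDS = {pow(11, x1, K) * pow(7, x2, K) % K: (x1, x2)
          for x1 in range(2) for x2 in range(4)}

def character(betas):
    """The character with chi(B_j) = exp(2 pi i beta_j / h_j), as in (9)."""
    def chi(a):
        xs = COORDS[a % K]       # raises KeyError for non-units: chi lives on M(15) only
        angle = sum(b * x / h for b, x, (_, h) in zip(betas, xs, BASIS))
        return cmath.exp(2j * cmath.pi * angle)
    return chi

characters = [character(betas) for betas in product(range(2), range(4))]
UNITS = sorted(COORDS)
print(UNITS, len(characters))
```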

Two groups $G$ and $G'$, with elements $a, b, \ldots$ and $a', b', \ldots$, are
said to be isomorphic if it is possible to find a pairing of elements of $G$
with elements of $G'$, such that each element of $G$ corresponds to
precisely one element of $G'$, and conversely, and such that if $a \leftrightarrow a'$
and $b \leftrightarrow b'$, then $ab \leftrightarrow a'b'$. In this case the groups are abstractly
identical, and any theorem concerning one group has an immediate
analog for the other group. To construct such an isomorphism
between two finite abelian groups, it suffices to find a one-to-one
correspondence of basis elements such that corresponding elements
have equal orders. For let the bases be $C_1, \ldots, C_s$ and $C_1', \ldots, C_s'$,
so named that $\operatorname{ord} C_i = \operatorname{ord} C_i'$, for $i = 1, \ldots, s$. We
make $a$ and $a'$ correspond if

$$a = C_1^{x_1} \cdots C_s^{x_s} \quad \text{and} \quad a' = C_1'^{\,x_1} \cdots C_s'^{\,x_s}, \qquad 0 \le x_i < \operatorname{ord} C_i.$$

For if also

$$b = C_1^{y_1} \cdots C_s^{y_s} \quad \text{and} \quad b' = C_1'^{\,y_1} \cdots C_s'^{\,y_s},$$

then

$$ab = C_1^{x_1+y_1} \cdots C_s^{x_s+y_s}, \qquad a'b' = C_1'^{\,x_1+y_1} \cdots C_s'^{\,x_s+y_s},$$

and

$$(ab)' = C_1'^{\,x_1+y_1} \cdots C_s'^{\,x_s+y_s} = a'b'.$$

Moreover, this is a one-to-one correspondence, since the representations
by basis elements are unique for the ranges $0 \le x_i < \operatorname{ord} C_i =
\operatorname{ord} C_i'$, $1 \le i \le s$.

For the basis $B_1, \ldots, B_m$ of $M(k)$, define characters $\chi_1, \ldots, \chi_m$
as follows:

$$\chi_i(B_j) = \begin{cases} e^{2\pi i / h_i} & \text{if } j = i, \\ 1 & \text{if } j \ne i, \end{cases} \qquad 1 \le j \le m. \qquad (10)$$

Then from the sentence containing equation (9), we see that every
character can be represented uniquely in the form

$$\chi = \chi_1^{\beta_1} \cdots \chi_m^{\beta_m}, \qquad 0 \le \beta_i < h_i \text{ for } i = 1, \ldots, m,$$

since this gives $\chi(B_j) = e^{2\pi i \beta_j / h_j}$. (We say that two characters are
equal if they have the same value for every element of the group, and
define the product of two characters as the function whose values
are the products of the component values; this function is also a
character, by the sentence following (9).) Under multiplication, the
characters form a group $X(k)$, having basis $\chi_1, \ldots, \chi_m$; since
$\operatorname{ord} \chi_i = \operatorname{ord} B_i$, the groups $X(k)$ and $M(k)$ are isomorphic. The
unit element of $X(k)$ is the character $\chi_0$, called the principal character,
such that $\chi_0(a) = 1$ for every $a$ in $M(k)$.

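We summarize the chief results obtained so far.

Theorem 6-5. There are $h$ distinct characters (mod $k$), and these
form a group $X(k)$ which is isomorphic to $M(k)$. Every value $\chi(a)$
is an $h$th root of unity. The characters $\chi_1, \ldots, \chi_m$ defined in (10)
form a basis for $X(k)$.

We shall also need the following result.

Theorem 6-6. If $\chi$ is in $X(k)$, then

$$\sum_{a \in M(k)} \chi(a) = \begin{cases} h & \text{if } \chi = \chi_0, \\ 0 & \text{if } \chi \ne \chi_0, \end{cases}$$

while if $a$ is in $M(k)$, then

$$\sum_{\chi \in X(k)} \chi(a) = \begin{cases} h & \text{if } a \equiv 1 \pmod k, \\ 0 & \text{if } a \not\equiv 1 \pmod k. \end{cases}$$

Proof: We have

$$\sum_{a \in M(k)} \chi_0(a) = \sum_{a \in M(k)} 1 = h.$$

If $\chi \ne \chi_0$, then for some $\bar{a}$ in $M(k)$, $\chi(\bar{a}) \ne 1$. For this $\bar{a}$,

$$\chi(\bar{a}) \sum_a \chi(a) = \sum_a \chi(\bar{a})\chi(a) = \sum_a \chi(\bar{a}a),$$

and, as $a$ runs over a reduced residue system, so does $\bar{a}a$, so that

$$\chi(\bar{a}) \sum_a \chi(a) = \sum_a \chi(a),$$
$$\sum_a \chi(a) = 0.$$

If $a \equiv 1 \pmod k$, every term of the second sum is $\chi(1) = 1$, and the
sum is $h$. If $a \not\equiv 1 \pmod k$, and $a \equiv B_1^{x_1} \cdots B_m^{x_m}$, then some $x_i \ne 0$. For
this $i$, $\chi_i(a) \ne 1$, and

$$\chi_i(a) \sum_\chi \chi(a) = \sum_\chi \chi_i(a)\chi(a) = \sum_\chi \chi'(a),$$

where $\chi' = \chi_i \chi$. As $\chi$ runs over $X(k)$, so also does $\chi_i \chi = \chi'$, and

$$\chi_i(a) \sum_\chi \chi(a) = \sum_\chi \chi(a),$$
$$\sum_\chi \chi(a) = 0.$$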
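Both relations of Theorem 6-6 can be checked on a small cyclic example. The sketch below takes $k = 7$ with primitive root 3, so that the $h = 6$ characters are $\chi_j(3^t) = e^{2\pi i j t / 6}$ for $j = 0, \ldots, 5$; the modulus and the names are illustrative choices.

```python
import cmath

K, G = 7, 3                      # 3 is a primitive root of 7
H = K - 1                        # h = phi(7) = 6
INDEX = {pow(G, t, K): t for t in range(H)}   # maps the unit a = G^t to its index t

def chi(j, a):
    """Value at the unit a of the j-th character, chi_j(G^t) = exp(2 pi i j t / h)."""
    return cmath.exp(2j * cmath.pi * j * INDEX[a % K] / H)

units = sorted(INDEX)
sum_over_a = [sum(chi(j, a) for a in units) for j in range(H)]            # one entry per character
sum_over_chi = {a: sum(chi(j, a) for j in range(H)) for a in units}       # one entry per unit

print([round(abs(v), 9) for v in sum_over_a])
print({a: round(abs(v), 9) for a, v in sum_over_chi.items()})
```

Here `sum_over_a[0]` belongs to the principal character and `sum_over_chi[1]` to the residue class of 1; these are the only entries the theorem allows to be nonzero.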
$\chi(a)$ has so far been defined only for arguments relatively prime
to $k$. For simplicity in later formulas, we define

$$\chi(a) = 0, \quad \text{if } (a, k) > 1.$$

This does not affect the validity of Theorem 6-6.

The duality of the relations of Theorem 6-6 is a reflection of the
isomorphism of $X(k)$ and $M(k)$. In a sense, the reason for the
importance of characters in the investigation of primes in progressions
lies in the second relation, since it singles out the elements of
a particular residue class (mod $k$), so that by use of the relation

$$\sum_{\substack{u < a < v \\ a \equiv 1 \pmod k}} g(a) = \frac{1}{h} \sum_{u < a < v} g(a) \sum_\chi \chi(a),$$

sums can be extended over an entire interval instead of a finite or
infinite arithmetic progression $kt + 1$. Moreover, by a slight modification,
any other residue class can be distinguished in the same way.


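Theorem 6-7. If $(a, k) = (b, k) = 1$, then

$$\sum_{\chi \in X(k)} \frac{\chi(a)}{\chi(b)} = \begin{cases} h & \text{if } a \equiv b \pmod k, \\ 0 & \text{otherwise.} \end{cases}$$

Proof: Choose $c$ so that $bc \equiv 1 \pmod k$. Then

$$\sum_\chi \frac{\chi(a)}{\chi(b)} = \sum_\chi \frac{\chi(a)\chi(c)}{\chi(b)\chi(c)} = \sum_\chi \frac{\chi(ac)}{\chi(1)} = \sum_\chi \chi(ac),$$

and, by Theorem 6-6, the last sum is $h$ or 0 according as $ac$ is or is
not congruent to 1 (mod $k$), that is, according as $a$ is or is not
congruent to $b$ (mod $k$).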


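It should be noticed that the function

$$\chi(n) = \begin{cases} (-1)^{(n-1)/2} & \text{for } n \text{ odd,} \\ 0 & \text{for } n \text{ even,} \end{cases}$$

introduced in Section 6-1 is a character modulo 4. It and the principal
character

$$\chi_0(n) = \begin{cases} 1 & \text{for } n \text{ odd,} \\ 0 & \text{for } n \text{ even,} \end{cases}$$

constitute the group $X(4)$ of order $\varphi(4) = 2$. The correspondence

$$\chi_0 \leftrightarrow 1, \qquad \chi \leftrightarrow 3,$$

describes the isomorphism between $X(4)$ and $M(4)$; each is the cyclic
group of two elements.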
6-3 The L-functions. For each character $\chi$, we define a function
$L(s, \chi)$ for $s > 1$ by the equation

$$L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s},$$

or equivalently (according to Theorem 6-3) by the equation

$$L(s, \chi) = \prod_p \left(1 - \frac{\chi(p)}{p^s}\right)^{-1}. \qquad (11)$$

In particular,

$$L(s, \chi_0) = \prod_{p \nmid k} (1 - p^{-s})^{-1} = \prod_{p \mid k} (1 - p^{-s})\, \zeta(s),$$

so that, by equation (2),

$$L(s, \chi_0) = \prod_{p \mid k} (1 - p^{-s}) \left( \frac{s}{s - 1} - s \int_1^{\infty} \frac{(t)}{t^{s+1}}\, dt \right), \qquad (12)$$

where $(t) = t - [t]$ as before. This latter representation for $L(s, \chi_0)$ is
consistent with the series definition for $s > 1$, and may be taken as the
definition for $0 < s < 1$.

For the proof of Dirichlet's theorem, it is necessary to know some
of the properties of these $L$-functions. All the relevant properties
can be proved by elementary arguments, but the proofs frequently
can be simplified considerably if use is made of the theory of functions
of a complex variable. In these cases alternative proofs will
be given.

Theorem 6-8. $L(s, \chi_0)$ is continuous for $s > 1$, and

$$\lim_{s \to 1^+} (s - 1) L(s, \chi_0) = \frac{h}{k}.$$

Proof: For $s \ge s_0 > 1$,

$$L(s, \chi_0) = \sum_{(n,k)=1} \frac{1}{n^s} \ll \sum_{n=1}^{\infty} \frac{1}{n^{s_0}} = \zeta(s_0),$$

so that the series for $L(s, \chi_0)$ converges uniformly in any interval to
the right of $s = 1$. Since the separate terms are continuous, the sum
is also continuous. Moreover, by (12),

$$\lim_{s \to 1^+} (s - 1) L(s, \chi_0) = \prod_{p \mid k} (1 - p^{-1}) = \frac{\varphi(k)}{k} = \frac{h}{k}.$$

For $\chi \ne \chi_0$, Theorem 6-6 shows that for arbitrary $n_0$,

$$\sum_{n=n_0+1}^{n_0+k} \chi(n) = 0,$$

so that by grouping the terms of

$$\sum_{n=1}^{N} \chi(n)$$

in blocks of $k$, with perhaps part of a block left over, we see that

$$\left| \sum_{n=1}^{N} \chi(n) \right| < h.$$

It follows from Theorem 6-2 that the Dirichlet series for $L(s, \chi)$ is
convergent for $s > 0$. We need a slightly stronger result, which is
proved in the following theorem.


Theorem 6-9. If $\chi \ne \chi_0$, $L(s, \chi)$ has a continuous derivative
(and is therefore itself continuous) for $s > 0$.

Elementary proof: We use the standard theorem from analysis,
that if the series resulting from termwise differentiation of a given
series converges uniformly over an interval, then its sum is the
derivative of the original series. The termwise derivative of

$$\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}$$

is

$$-\sum_{n=1}^{\infty} \frac{\chi(n) \log n}{n^s}, \qquad (13)$$

and for $0 < s_0 \le s \le s_1$, the result follows from Theorem 6-2 by
taking $a_n = \chi(n)$, $b_n(s) = n^{-s} \log n$ (which decreases monotonically
as soon as $n > e^{1/s_0}$). But $s_0$ may be arbitrarily
small, and $s_1$ arbitrarily large, so that every $s > 0$ can be included in
an interval in which $L(s, \chi)$ is continuously differentiable.


Alternative proof: Applying Theorem 6-2 and the fact that 


00 



n *1 




7 


where $\sigma = \operatorname{Re} s$, we see that the series for $L(s, \chi)$ is uniformly
convergent for $\operatorname{Re} s \ge \sigma_0 > 0$. Since each term of the series is an
analytic function of $s$, the sum is also analytic, and is therefore
differentiable.


6-4 Nonelementary proof of Dirichlet's theorem. There is a proof
of Dirichlet's theorem which is remarkably simple and illuminating,
and which fails to be elementary only in the sense that logarithms of
complex numbers are used. If the student who is not familiar with
this extension will assume that the usual properties of logarithms of
positive numbers (including the form of the Maclaurin expansion of
$\log (1 + x)$ for $|x| < 1$) carry over to logarithms of nonzero complex
numbers, he will find this proof much more straightforward than the
elementary proof given in the following section, where use is made of
the relation

$$\frac{d}{dx} \log f(x) = \frac{f'(x)}{f(x)}$$

to avoid logarithms entirely.

For $s > 1$, $|\chi(p)/p^s| < 1$, so that for such $s$ we can describe a
branch of the function $\log (1 - \chi(p)/p^s)$ by the equation

$$\log \left(1 - \frac{\chi(p)}{p^s}\right) = -\sum_{m=1}^{\infty} \frac{\chi(p^m)}{m p^{ms}}.$$

By (11), this induces the choice

$$\log L(s, \chi) = \sum_p \sum_{m=1}^{\infty} \frac{\chi(p^m)}{m p^{ms}} \quad \text{for } s > 1. \qquad (14)$$

Theorem 6-10. For each $\chi$, the function

$$F(s, \chi) = \log L(s, \chi) - \sum_p \frac{\chi(p)}{p^s} \qquad (15)$$

is bounded in absolute value for $s > 1$.

Proof: We rewrite (14) in the form

$$\log L(s, \chi) - \sum_p \frac{\chi(p)}{p^s} = \sum_p \sum_{m=2}^{\infty} \frac{\chi(p^m)}{m p^{ms}}.$$

Here,

$$\sum_p \sum_{m=2}^{\infty} \frac{\chi(p^m)}{m p^{ms}} \ll \sum_p \sum_{m=2}^{\infty} \frac{1}{2 p^{ms}} = \frac{1}{2} \sum_p \frac{1}{p^{2s}(1 - p^{-s})}$$
$$\ll \frac{1}{2} \sum_p \frac{1}{p^{2s}(1 - 2^{-s})} = \frac{(1 - 2^{-s})^{-1}}{2} \sum_p \frac{1}{p^{2s}} \ll \frac{(1 - 2^{-s})^{-1}}{2}\, \zeta(2s),$$

and since $\zeta(2s)$ is bounded for $2s \ge 1 + \epsilon$, the theorem follows.
We can now complete the proof of Dirichlet's theorem, except for
one gap which will be considered later.

Theorem 6-11 (Dirichlet's theorem). If $(k, l) = 1$, then there are
infinitely many primes of the form $kt + l$.

Proof: Multiply equation (15) by $1/\chi(l)$ and sum over all $\chi$ in
$X(k)$. This gives

$$\sum_\chi \frac{\log L(s, \chi)}{\chi(l)} = \sum_\chi \frac{1}{\chi(l)} \sum_p \frac{\chi(p)}{p^s} + \sum_\chi \frac{F(s, \chi)}{\chi(l)} = \sum_p \frac{1}{p^s} \sum_\chi \frac{\chi(p)}{\chi(l)} + \sum_\chi \frac{F(s, \chi)}{\chi(l)},$$

and, by Theorem 6-7,

$$\sum_\chi \frac{\log L(s, \chi)}{\chi(l)} = h \sum_{p \equiv l \pmod k} \frac{1}{p^s} + \sum_\chi \frac{F(s, \chi)}{\chi(l)}. \qquad (16)$$

Let $s \to 1^+$ in (16). The second term on the right remains bounded,
by Theorem 6-10. We know that

$$\lim_{s \to 1^+} L(s, \chi_0) = \infty,$$

so that

$$\lim_{s \to 1^+} \frac{\log L(s, \chi_0)}{\chi_0(l)} = \infty.$$

Suppose for the moment that it had been shown that the remaining
functions $L(s, \chi)$ (which we know to be continuous at $s = 1$) have
nonzero values $L(1, \chi)$ at $s = 1$. It would follow that

$$\lim_{s \to 1^+} \left| \sum_{\chi \ne \chi_0} \frac{\log L(s, \chi)}{\chi(l)} \right| < \infty,$$

and (16) would then imply that

$$\lim_{s \to 1^+} \sum_{p \equiv l \pmod k} \frac{1}{p^s} = \infty,$$

an equation which is possible only if the sum has infinitely many
terms. Thus when we show that $L(1, \chi) \ne 0$ if $\chi \ne \chi_0$, we shall
have proved not only Dirichlet's theorem but the stronger result that
the series

$$\sum_{p \equiv l \pmod k} \frac{1}{p}$$

diverges.
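The conclusion can be made tangible with a small experiment, which is an illustration and of course not a proof. The sketch below takes $k = 10$ (an arbitrary choice) and accumulates $\sum 1/p$ and the number of primes $p \le N$ in each reduced residue class $l$ modulo $k$; the helper names are hypothetical. The sums keep growing, very slowly, in every class, and the counts in the four classes stay close to one another.

```python
from math import gcd

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def reciprocal_sums(k, N):
    """For each l prime to k: the sum of 1/p and the count of primes p <= N with p = l (mod k)."""
    sums = {l: 0.0 for l in range(1, k) if gcd(l, k) == 1}
    counts = dict.fromkeys(sums, 0)
    for p in primes_up_to(N):
        l = p % k
        if l in sums:            # skips the primes dividing k
            sums[l] += 1.0 / p
            counts[l] += 1
    return sums, counts

for N in (10**3, 10**4, 10**5):
    sums, counts = reciprocal_sums(10, N)
    print(N, counts, {l: round(v, 4) for l, v in sums.items()})
```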
6-5 Elementary proof of Dirichlet's theorem. It is possible to
avoid the complex logarithm $\log L(s, \chi)$ by using its derivative
instead:

$$\frac{d}{ds} \log L(s, \chi) = \frac{L'(s, \chi)}{L(s, \chi)} = \frac{L'}{L}(s, \chi).$$

If we could use the relation (14), we could immediately deduce that

$$\frac{L'(s, \chi)}{L(s, \chi)} = -\sum_p \sum_{m=1}^{\infty} \frac{\chi(p^m) \log p}{p^{ms}};$$

since we cannot, we arrive at the same result by the rather more
awkward method of dividing $L'(s, \chi)$ by $L(s, \chi)$. In the process,
we shall have occasion to use some properties of the Möbius $\mu$-function,
which is defined by the following relations:

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \text{ is divisible by a square larger than 1,} \\ (-1)^r & \text{if } n \text{ is the product of } r \text{ distinct primes.} \end{cases}$$

Alternatively, $\mu$ is the multiplicative function (that is, $\mu(mn) =
\mu(m)\mu(n)$ whenever $(m, n) = 1$) such that

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ -1 & \text{if } n = p, \text{ a prime,} \\ 0 & \text{if } n = p^{\alpha},\ \alpha > 1. \end{cases}$$

The properties we shall need are these.*

(a) $$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases}$$

(b) If $f$ is any number-theoretic function and

$$F(n) = \sum_{d \mid n} f(d),$$

then

$$f(n) = \sum_{d \mid n} \mu(d) F\!\left(\frac{n}{d}\right).$$


Theorem 6-12. If f is a completely multiplicative funciiouj and the 


series 


00 




/(n) 


n=l ri 


See, for example, Volume I, Theorems 6-5 and 6-6. 
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converges absolutely for s > So» then 


f (M)~' = 

,n=l / n=l 


for s > So- 
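Proof: We have

$$\sum_{m=1}^{\infty} \frac{f(m)}{m^{s}} \sum_{n=1}^{\infty} \frac{f(n)\mu(n)}{n^{s}} = \sum_{m,n=1}^{\infty} \frac{f(mn)\mu(n)}{(mn)^{s}} = \sum_{t=1}^{\infty} \frac{f(t)}{t^{s}} \sum_{d \mid t} \mu(d) = 1.$$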
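The two Möbius properties quoted above, and the statement of Theorem 6-12 in the simplest case $f(n) = 1$, can be checked numerically. The following is a minimal sketch, not a definitive implementation; the function names `mobius`, `mobius_divisor_sum`, and `truncated_product` are hypothetical helpers introduced only for illustration, and the series are truncated after finitely many terms.

```python
def mobius(n):
    """Return mu(n), computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one prime factor is left over
        result = -result
    return result


def divisors(n):
    """All positive divisors of n (naive, adequate for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius_divisor_sum(n):
    """Property (a): the sum of mu(d) over the divisors d of n."""
    return sum(mobius(d) for d in divisors(n))


def truncated_product(s, terms):
    """Product of the truncated series of Theorem 6-12 with f(n) = 1.

    For s > 1 the value should approach 1 as `terms` grows.
    """
    first = sum(1.0 / n ** s for n in range(1, terms + 1))
    second = sum(mobius(n) / n ** s for n in range(1, terms + 1))
    return first * second


if __name__ == "__main__":
    print([mobius(n) for n in range(1, 13)])
    print(truncated_product(3, 2000))
```

With $s = 3$ and a few thousand terms the product should differ from 1 only by the size of the neglected tails.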


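Theorem 6-13. For each $\chi$, the relation

$$\frac{L'}{L}(s,\chi) = -\sum_{n=1}^{\infty} \frac{\chi(n)\Lambda(n)}{n^{s}} \qquad (17)$$

holds for $s > 1$, where

$$\Lambda(n) = \begin{cases} \log p & \text{if } n = p^{\alpha} \text{ for some } \alpha > 0 \text{ and prime } p, \\ 0 & \text{otherwise.} \end{cases}$$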


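Proof: By the preceding theorem and the expression (13) for $L'(s,\chi)$, we have, for $s > 1$,

$$\frac{L'}{L}(s,\chi) = -\sum_{m=1}^{\infty} \frac{\chi(m) \log m}{m^{s}} \sum_{j=1}^{\infty} \frac{\chi(j)\mu(j)}{j^{s}} = -\sum_{m,j=1}^{\infty} \frac{\chi(mj)\mu(j) \log m}{(mj)^{s}} = -\sum_{n=1}^{\infty} \frac{\chi(n)}{n^{s}} \sum_{d \mid n} \mu(d) \log \frac{n}{d}.$$

But from the obvious relation

$$\log n = \sum_{d \mid n} \Lambda(d)$$

and the Möbius inversion formula quoted above, we have

$$\Lambda(n) = \sum_{d \mid n} \mu(d) \log \frac{n}{d},$$

and the theorem follows.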
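The two identities used at the end of this proof, $\log n = \sum_{d \mid n} \Lambda(d)$ and its Möbius inverse, are easy to test for small $n$. The sketch below is only an illustration under the definitions given in the text; the helper names (`von_mangoldt`, `lambda_by_inversion`, and so on) are hypothetical.

```python
from math import log


def mobius(n):
    """Return mu(n), computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def von_mangoldt(n):
    """Lambda(n) = log p if n = p^a with a > 0 and p prime, otherwise 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:   # strip every factor p
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)               # n itself is prime


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def log_by_divisor_sum(n):
    """Right side of log n = sum of Lambda(d) over d | n."""
    return sum(von_mangoldt(d) for d in divisors(n))


def lambda_by_inversion(n):
    """Right side of Lambda(n) = sum of mu(d) log(n/d) over d | n."""
    return sum(mobius(d) * log(n / d) for d in divisors(n))


if __name__ == "__main__":
    for n in (8, 12, 49, 97):
        print(n, von_mangoldt(n), lambda_by_inversion(n))
```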
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Theorem 6-14. For each x* the function 

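$$G(s,\chi) = \frac{L'}{L}(s,\chi) + \sum_{p} \frac{\chi(p) \log p}{p^{s}}$$

is bounded in absolute value for $s > 1$.

Proof: Equation (17) may be rewritten in the form

$$\frac{L'}{L}(s,\chi) = -\sum_{p} \frac{\chi(p) \log p}{p^{s}} - \sum_{p} \sum_{m=2}^{\infty} \frac{\chi(p^{m}) \log p}{p^{ms}}, \qquad (18)$$

and

$$\Bigl| \sum_{p} \sum_{m=2}^{\infty} \frac{\chi(p^{m}) \log p}{p^{ms}} \Bigr| \le \sum_{p} \log p \sum_{m=2}^{\infty} \frac{1}{p^{ms}} = \sum_{p} \frac{\log p}{p^{2s}(1 - p^{-s})} \le \sum_{p} \frac{\log p}{p^{2s}(1 - 2^{-s})},$$

and the last series clearly converges for $s > \tfrac12$.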

We can now complete the proof of Dirichlet's theorem in much the same way as before. Multiplying both sides of (18) by $1/\chi(l)$, and summing over all $\chi$, we obtain

$$\sum_{\chi} \frac{1}{\chi(l)} \frac{L'}{L}(s,\chi) = -\sum_{p} \frac{\log p}{p^{s}} \sum_{\chi} \frac{\chi(p)}{\chi(l)} + \sum_{\chi} \frac{G(s,\chi)}{\chi(l)} = -h \sum_{p \equiv l \ (\mathrm{mod}\ k)} \frac{\log p}{p^{s}} + \sum_{\chi} \frac{G(s,\chi)}{\chi(l)}.$$

Now let $s \to 1^{+}$. The second term on the right remains bounded. Assuming again that $L(1,\chi) \ne 0$ for $\chi \ne \chi_0$, the quantity $1/L(s,\chi)$ is also bounded for $s$ sufficiently close to 1, since $L$ is continuous at $s = 1$. For $\chi \ne \chi_0$, $L'(s,\chi)$ remains bounded, by Theorem 6-9. On the other hand,


L' 


(s, xo) 


A in) 53 JL 

(n,*) = 1 W* p+* P* P+* P 


= E 


= log I/(s, xo) + Xo), 


by Theorem 6-10, and the quantity log L(s, xo) + ^(«> Xo) increases 
without bound as $s \to 1^{+}$. It follows that

$$\lim_{s \to 1^{+}} \sum_{p \equiv l \ (\mathrm{mod}\ k)} \frac{\log p}{p^{s}} = \infty,$$

and the theorem is again proved, except for the verification of the fact that $L(1,\chi) \ne 0$ for $\chi \ne \chi_0$.


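6-6 Proof that $L(1,\chi) \ne 0$.

Theorem 6-15. If $\chi$ assumes a nonreal value for some $n$, then $L(1,\chi) \ne 0$.

Proof: Let $\chi$ be such a character, and let $\bar\chi$ be the function whose value for each $a$ is the complex conjugate of that of $\chi$. Clearly $\bar\chi$ is also a character, and $\bar\chi \ne \chi$. But if $L(1,\chi) = 0$, then also

$$L(1,\bar\chi) = \overline{L(1,\chi)} = 0,$$

so at least two $L$-functions must vanish in this case. Since $L(s,\chi)$ is differentiable at $s = 1$, the quantities

$$L'(1,\chi) = \lim_{s \to 1} \frac{L(s,\chi)}{s - 1} \qquad \text{and} \qquad L'(1,\bar\chi) = \lim_{s \to 1} \frac{L(s,\bar\chi)}{s - 1}$$

exist, so that there is a number $A$ such that

$$\lim_{s \to 1} \frac{\prod_{\chi \ne \chi_0} L(s,\chi)}{(s - 1)^{2}} = A.$$

Since

$$\lim_{s \to 1^{+}} (s - 1) L(s,\chi_0) = \frac{h}{k},$$

we deduce that

$$\lim_{s \to 1^{+}} \prod_{\chi} L(s,\chi) = \lim_{s \to 1^{+}} (s - 1)\bigl((s - 1)L(s,\chi_0)\bigr) \frac{\prod_{\chi \ne \chi_0} L(s,\chi)}{(s - 1)^{2}} = 0 \cdot \frac{h}{k} \cdot A = 0.$$

But by (14),

$$\sum_{\chi} \log L(s,\chi) = \sum_{\chi} \sum_{p} \sum_{m=1}^{\infty} \frac{\chi(p^{m})}{m p^{ms}} = \sum_{p} \sum_{m=1}^{\infty} \frac{1}{m p^{ms}} \sum_{\chi} \chi(p^{m}) = h \sum_{\substack{p,\,m \\ p^{m} \equiv 1 \ (\mathrm{mod}\ k)}} \frac{1}{m p^{ms}} > 0,$$

so that

$$\lim_{s \to 1^{+}} \prod_{\chi} L(s,\chi) \ge 1.$$

This contradiction establishes the theorem.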


It would not be easy to avoid the use of the complex logarithm in this proof, since the Dirichlet series for $\prod_{\chi} L(s,\chi)$ has very complicated coefficients. To obtain an elementary proof, it is simpler to use a different combination of $L$-functions. Unfortunately, the choice we make can hardly be motivated by an elementary argument, but must remain a deus ex machina until Section 7-3. It is the left side of the inequality

$$L^{3}(s,\chi_0)\,|L(s,\chi)|^{4}\,|L(s,\chi^{2})|^{2} \ge 1; \qquad (19)$$

this inequality we now show to be valid for $s > 1$.

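Note first that for $z = r(\cos\theta + i \sin\theta)$,

$$|1 - z|^{2} = |1 - r\cos\theta - i r \sin\theta|^{2} = 1 - 2r\cos\theta + r^{2},$$

and that for arbitrary real $\theta$,

$$2\cos\theta + \cos 2\theta = 2\cos\theta + 2\cos^{2}\theta - 1 = 2\bigl(\cos\theta + \tfrac12\bigr)^{2} - \tfrac32 \ge -\tfrac32.$$

Using the fact that the geometric mean of three positive numbers is at most equal to their arithmetic mean, we see that, if $p \nmid k$ and

$$\chi(p) = \cos\theta_p + i \sin\theta_p,$$

then

$$\Bigl|1 - \frac{\chi(p)}{p^{s}}\Bigr|^{4} \Bigl|1 - \frac{\chi^{2}(p)}{p^{s}}\Bigr|^{2} = (1 - 2p^{-s}\cos\theta_p + p^{-2s})^{2}(1 - 2p^{-s}\cos 2\theta_p + p^{-2s})$$
$$\le \bigl(1 - \tfrac23 p^{-s}(2\cos\theta_p + \cos 2\theta_p) + p^{-2s}\bigr)^{3} \le (1 + p^{-s} + p^{-2s})^{3},$$

or

$$(1 - \chi_0(p)p^{-s})^{3}\,|1 - \chi(p)p^{-s}|^{4}\,|1 - \chi^{2}(p)p^{-s}|^{2} \le 1.$$

This inequality also holds if $p \mid k$, and, multiplying over all $p$, we obtain (19).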
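The local inequality just derived can be spot-checked on a grid. In the sketch below, $x$ stands for $p^{-s}$ and $z = e^{i\theta}$ for the value $\chi(p)$ when $p \nmid k$; this is an illustrative check under those assumptions rather than part of the proof, and the function names are hypothetical.

```python
import cmath
import math


def trig_lower_bound(theta):
    """The quantity 2 cos(theta) + cos(2 theta), which is never below -3/2."""
    return 2 * math.cos(theta) + math.cos(2 * theta)


def local_factor(x, theta):
    """(1 - x)^3 |1 - x z|^4 |1 - x z^2|^2 for z = exp(i theta), 0 < x < 1."""
    z = cmath.exp(1j * theta)
    return (1 - x) ** 3 * abs(1 - x * z) ** 4 * abs(1 - x * z * z) ** 2


def max_local_factor(steps=200):
    """Largest value of local_factor on a steps-by-steps grid of (x, theta)."""
    best = 0.0
    for i in range(1, steps):
        x = i / steps
        for j in range(steps):
            theta = 2 * math.pi * j / steps
            best = max(best, local_factor(x, theta))
    return best


if __name__ == "__main__":
    print(max_local_factor())   # should not exceed 1
```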


It is now simple to prove that $L(1,\chi) \ne 0$ if $\chi^{2} \ne \chi_0$, that is, if $\chi$ is nonreal. Supposing the opposite, and using the fact that $L'(s,\chi)$ is continuous at $s = 1$, we have that for $1 < s < s_1$,

$$|L(s,\chi)| = |L(s,\chi) - L(1,\chi)| = \Bigl| \int_{1}^{s} L'(u,\chi)\,du \Bigr| \le A_1 (s - 1),$$

where

$$A_1 = \max_{1 \le s \le s_1} |L'(s,\chi)|.$$

But now (19) can be recast in the form

$$(s - 1)\bigl((s - 1)L(s,\chi_0)\bigr)^{3} \Bigl| \frac{L(s,\chi)}{s - 1} \Bigr|^{4} |L(s,\chi^{2})|^{2} \ge 1,$$

in which the first factor tends to zero and the others remain bounded, as $s \to 1^{+}$. This inequality is false for some $s > 1$, and the contradiction shows that $L(1,\chi) \ne 0$ if $\chi^{2} \ne \chi_0$.

No device of this sort has been found for the case that $\chi(n)$ is real for all $n$. Showing that $L(1,\chi) \ne 0$ for a real character is the most difficult point in the entire proof. Dirichlet effected it by showing that $L(1,\chi)$ is a factor in the class number of a certain quadratic field. This and other algebraic proofs require a considerable amount of background; we shall content ourselves with an elementary and a function-theoretic proof. We first sketch the idea.

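If $s > 1$, then

$$\zeta(s)L(s,\chi) = \sum_{m=1}^{\infty} \frac{1}{m^{s}} \sum_{n=1}^{\infty} \frac{\chi(n)}{n^{s}} = \sum_{m,n=1}^{\infty} \frac{\chi(n)}{(mn)^{s}} = \sum_{t=1}^{\infty} \frac{1}{t^{s}} \sum_{d \mid t} \chi(d),$$

so that if we put

$$f(n) = \sum_{d \mid n} \chi(d), \qquad (20)$$

then

$$\zeta(s)L(s,\chi) = \sum_{n=1}^{\infty} \frac{f(n)}{n^{s}} \qquad (21)$$

for $s > 1$.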

By Theorem 6-16, below,

$$\sum_{n=1}^{\infty} \frac{f(n)}{n^{s}} \ge \sum_{m=1}^{\infty} \frac{1}{(m^{2})^{s}} = \zeta(2s),$$

so that even if the series $\sum f(n) n^{-s}$ converges to the right of $s = \tfrac12$, it is certainly not bounded near $s = \tfrac12$. In the analytic proof, we show that (21) is correct for $s > \tfrac12$ if $L(1,\chi) = 0$, and obtain the contradiction

$$\lim_{s \to 1/2^{+}} L(s,\chi)\zeta(s) = L(\tfrac12,\chi)\zeta(\tfrac12) = \infty.$$

In the elementary proof, questions of convergence are avoided by considering partial sums for $s = \tfrac12$ rather than the full series for $s$ near $\tfrac12$. It will be shown that

$$\sum_{n=1}^{x} \frac{f(n)}{\sqrt{n}} = 2\sqrt{x}\,L(1,\chi) + O(1),$$

and also that the sum on the left tends to infinity with $x$, so that the relation $L(1,\chi) = 0$ is impossible.


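Theorem 6-16. With $f$ as in (20),

$$f(n) \ge \begin{cases} 0 & \text{for all } n, \\ 1 & \text{for square } n. \end{cases}$$

Proof: Being the arithmetic sum function of a multiplicative function, $f$ is itself multiplicative,* so that

$$f(p_1^{\alpha_1} \cdots p_r^{\alpha_r}) = f(p_1^{\alpha_1}) \cdots f(p_r^{\alpha_r}).$$

Since $\chi$ is a real character, $\chi(p) = 0$ or $\pm 1$ for each prime $p$, and

$$f(p^{\alpha}) = \sum_{\beta=0}^{\alpha} \chi(p^{\beta}) = \sum_{\beta=0}^{\alpha} (\chi(p))^{\beta} = \begin{cases} 1 + 0 + \cdots + 0 & \text{if } \chi(p) = 0, \\ 1 + 1 + \cdots + 1 & \text{if } \chi(p) = 1, \\ 1 - 1 + \cdots + (-1)^{\alpha} & \text{if } \chi(p) = -1. \end{cases}$$

Hence

$$f(p^{\alpha}) = \begin{cases} 1 & \text{if } \chi(p) = 0, \\ \alpha + 1 & \text{if } \chi(p) = 1, \\ 1 & \text{if } \chi(p) = -1,\ \alpha \text{ even,} \\ 0 & \text{if } \chi(p) = -1,\ \alpha \text{ odd,} \end{cases}$$

and the theorem follows.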
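Theorem 6-16 can be illustrated with a concrete real character. The sketch below assumes the nonprincipal character modulo 4 (an example chosen here, not taken from the text); `chi4`, `f`, and `is_square` are hypothetical names.

```python
from math import isqrt


def chi4(n):
    """The nonprincipal (real) character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def f(n, chi=chi4):
    """f(n) = sum of chi(d) over the divisors d of n, as in (20)."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)


def is_square(n):
    root = isqrt(n)
    return root * root == n


if __name__ == "__main__":
    # The four cases of the formula for f(p^alpha):
    print(f(2 ** 5), f(5 ** 3), f(3 ** 2), f(3 ** 3))
```

Here $\chi(2) = 0$, $\chi(5) = 1$, and $\chi(3) = -1$, so the four printed values correspond in order to the four cases of the formula for $f(p^{\alpha})$.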

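Theorem 6-17. The relations

$$\sum_{n \le x} \chi(n) = O(1) \qquad (22)$$

and

$$\sum_{n \ge x} \frac{\chi(n)}{n^{s}} = O\!\left(\frac{1}{x^{s}}\right) \qquad (23)$$

hold as $x \to \infty$, for $s > 0$.

* Cf. Volume I, Theorem 6-3.

Proof: We have already noticed that if

$$S(x) = \sum_{n \le x} \chi(n),$$

then $|S(x)| \le h$, which implies (22). Using this, we have

$$\Bigl| \sum_{n=x}^{\infty} \frac{\chi(n)}{n^{s}} \Bigr| = \Bigl| \sum_{n=x}^{\infty} \frac{S(n) - S(n-1)}{n^{s}} \Bigr| = \Bigl| \sum_{n=x}^{\infty} S(n)\Bigl(\frac{1}{n^{s}} - \frac{1}{(n+1)^{s}}\Bigr) - \frac{S(x-1)}{x^{s}} \Bigr|$$
$$\le h \sum_{n=x}^{\infty} \Bigl(\frac{1}{n^{s}} - \frac{1}{(n+1)^{s}}\Bigr) + \frac{h}{x^{s}} = \frac{2h}{x^{s}},$$

which implies (23).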

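Theorem 6-18. There is a constant $C$ such that

$$\sum_{n \le x} \frac{1}{\sqrt{n}} = 2\sqrt{x} + C + O\!\left(\frac{1}{\sqrt{x}}\right).$$

Proof: Put

$$t_n = 2\sqrt{n} - 2\sqrt{n-1} - \frac{1}{\sqrt{n}} = \int_{n-1}^{n} \frac{dx}{\sqrt{x}} - \frac{1}{\sqrt{n}},$$

so that

$$\sum_{n=2}^{x} t_n = 2\sqrt{x} - 2 - \sum_{n=2}^{x} \frac{1}{\sqrt{n}}.$$

Now $t_n$, being the area of the triangular region bounded by the curve $y = x^{-1/2}$ and the lines $x = n - 1$ and $y = n^{-1/2}$, is positive and smaller than $(n-1)^{-1/2} - n^{-1/2}$, so that the series

$$\sum_{n=2}^{\infty} t_n$$

converges, and

$$\sum_{n=x+1}^{\infty} t_n < \sum_{n=x+1}^{\infty} \Bigl(\frac{1}{\sqrt{n-1}} - \frac{1}{\sqrt{n}}\Bigr) = \frac{1}{\sqrt{x}}.$$

Hence

$$2\sqrt{x} - \sum_{n=1}^{x} \frac{1}{\sqrt{n}} = 1 + \sum_{n=2}^{\infty} t_n - \sum_{n=x+1}^{\infty} t_n = 1 + \sum_{n=2}^{\infty} t_n + O\!\left(\frac{1}{\sqrt{x}}\right).$$

This proves the theorem for integral $x$; its extension to real $x$ is immediate.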
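The existence of the constant $C$ in Theorem 6-18 can be seen numerically by tabulating $\sum_{n \le x} n^{-1/2} - 2\sqrt{x}$ for increasing $x$. This is a minimal sketch with a hypothetical helper name; it only illustrates the statement and does not determine $C$ exactly.

```python
from math import sqrt


def sqrt_sum_defect(x):
    """Return sum_{n <= x} 1/sqrt(n) - 2*sqrt(x) for a positive integer x."""
    return sum(1.0 / sqrt(n) for n in range(1, x + 1)) - 2.0 * sqrt(x)


if __name__ == "__main__":
    for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
        print(x, sqrt_sum_defect(x))
```

The printed values should settle down, with successive differences of the order $1/\sqrt{x}$, as the theorem predicts.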
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Theorem 6 - 19 . If x ^ xo is 

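a real character, then $L(1,\chi) \ne 0$.

Proof: Put

$$G(x) = \sum_{n=1}^{x} \frac{f(n)}{\sqrt{n}}.$$

By Theorem 6-16,

$$G(x) \ge \sum_{m=1}^{\sqrt{x}} \frac{1}{\sqrt{m^{2}}} = \sum_{m=1}^{\sqrt{x}} \frac{1}{m},$$

so that $G(x) \to \infty$ with $x$.

On the other hand,

$$G(x) = \sum_{j=1}^{x} \frac{1}{\sqrt{j}} \sum_{d \mid j} \chi(d) = \sum_{uv \le x} \frac{\chi(v)}{\sqrt{uv}}.$$

This sum, extended over the lattice points $u$, $v$ for which $u \ge 1$, $v \ge 1$, $uv \le x$, we split into two parts, as indicated in Fig. 6-1:

[Figure 6-1]

$$G(x) = \sum_{u=1}^{\sqrt{x}} \sum_{v=\sqrt{x}+1}^{x/u} \frac{\chi(v)}{\sqrt{uv}} + \sum_{v=1}^{\sqrt{x}} \sum_{u=1}^{x/v} \frac{\chi(v)}{\sqrt{uv}}$$
$$= \sum_{u=1}^{\sqrt{x}} \frac{1}{\sqrt{u}} \sum_{v=\sqrt{x}+1}^{x/u} \frac{\chi(v)}{\sqrt{v}} + \sum_{v=1}^{\sqrt{x}} \frac{\chi(v)}{\sqrt{v}} \sum_{u=1}^{x/v} \frac{1}{\sqrt{u}}$$
$$= \sum_{u=1}^{\sqrt{x}} \frac{1}{\sqrt{u}}\, O(x^{-1/4}) + \sum_{v=1}^{\sqrt{x}} \frac{\chi(v)}{\sqrt{v}} \Bigl( 2\sqrt{\frac{x}{v}} + C + O\Bigl(\sqrt{\frac{v}{x}}\Bigr) \Bigr)$$
$$= O(x^{-1/4}) \cdot O(x^{1/4}) + 2\sqrt{x} \sum_{v=1}^{\sqrt{x}} \frac{\chi(v)}{v} + C \cdot O(1) + \frac{1}{\sqrt{x}} \sum_{v=1}^{\sqrt{x}} O(1)$$
$$= 2\sqrt{x} \Bigl( L(1,\chi) + O\Bigl(\frac{1}{\sqrt{x}}\Bigr) \Bigr) + O(1),$$

so that

$$\sum_{n=1}^{x} \frac{f(n)}{\sqrt{n}} = 2\sqrt{x}\,L(1,\chi) + O(1). \qquad (24)$$

Thus, if $L(1,\chi)$ were zero, $G(x)$ would remain bounded as $x \to \infty$, which is not the case.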
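Relation (24) can be observed numerically for a particular real character. The sketch below again assumes the nonprincipal character modulo 4, for which $L(1,\chi) = \pi/4$ by the Leibniz series; the names `chi4`, `divisor_sums`, and `G` are hypothetical, and the computation is an illustration only.

```python
from math import pi, sqrt


def chi4(n):
    """The nonprincipal (real) character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def divisor_sums(limit, chi):
    """List f with f[n] = sum of chi(d) over d | n, for 1 <= n <= limit."""
    f = [0] * (limit + 1)
    for d in range(1, limit + 1):
        value = chi(d)
        if value:
            for multiple in range(d, limit + 1, d):
                f[multiple] += value
    return f


def G(x, chi=chi4):
    """G(x) = sum_{n <= x} f(n) / sqrt(n)."""
    f = divisor_sums(x, chi)
    return sum(f[n] / sqrt(n) for n in range(1, x + 1))


if __name__ == "__main__":
    for x in (10 ** 2, 10 ** 3, 10 ** 4):
        print(x, G(x) / (2 * sqrt(x)), pi / 4)
```

The ratio $G(x)/(2\sqrt{x})$ should approach $L(1,\chi)$ with an error of order $1/\sqrt{x}$.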

A rather more straightforward (function-theoretic) proof can be obtained by extending (21), which we know to be valid for $s > 1$, to the range $s > \tfrac12$, under the assumption that $L(1,\chi) = 0$. By an argument quite similar to, but slightly simpler than, that which yielded (24), it can be shown that if $L(1,\chi) = 0$, then

$$\sum_{n=1}^{x} f(n) = O(\sqrt{x}).$$


Theorem 6-2 implies that the series 


n=i n® ^ 


" (^(1) 

2^ c_i- 

n = i n ^ 


converges for s > ^. Now let o-q be a real number greater than 
and let s be a complex number with Be 5 = cr > a-Q. Then for 
V > u > 1, we have 


E 


n *u 


fin) 


= E 


n =u 


f{n)/n'’o 




= E 


y /(^) 

1, m=i m'o w‘'o 


n =u 


^Tq 


’v f (M. (J: 

n=um=i m"o (n+l)'-"" 


, . ® /(^«) 


r, m'^o 


m = 1 


i' ^ /(w) 

(«— ao) y- i , 

m=l 


SO that 


L 


fin) 


=,x n 


n 


r — 1 


< A Z 


„=u n‘-‘'o 


(n + 1) 




t—1 


= A Z (s 


n — u 


< A "z -,-4+T + 

nasu 


“"“'I 


n+1 


dz 


2*— ao+l 




<T— (Tq+I 


where A is such that 


n=l n"0 


for all ^ 1- 
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It follows that the series 



$$\sum_{n=1}^{\infty} \frac{f(n)}{n^{s}}$$

converges to an analytic function in the half-plane $\sigma > \tfrac12$. Since it coincides with $L(s,\chi)\zeta(s)$ for $s$ real and larger than 1, it represents an analytic continuation of $L(s,\chi)\zeta(s)$ for $\sigma > \tfrac12$. But it is unbounded near $s = \tfrac12$, while $L(s,\chi)\zeta(s)$ is not.



CHAPTER 7 


THE PRIME NUMBER THEOREM 


7-1 Introduction. It is shown in elementary number theory texts* that if

$$\lim_{x \to \infty} \frac{\pi(x)}{x/\log x}$$

exists, it must have the value 1, and that there are positive constants $c$ and $c'$ such that for $x \ge 2$,

$$c < \frac{\pi(x)}{x/\log x} < c'.$$

Neither of these results implies the other, of course; together they show that

$$0 < \liminf_{x \to \infty} \frac{\pi(x)}{x/\log x} \le 1 \le \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x} < \infty.$$

(Here, as always, $\pi(x)$ denotes the number of primes less than or equal to $x$.) Both results were obtained by Chebyshev in 1851 and 1852 (in rather more precise form), but it was not until some forty-five years later that the final link was supplied by Hadamard and de la Vallée Poussin, who showed independently that the limit actually exists, and thus proved the Prime Number Theorem. Both proofs made essential use of the theory of functions of a complex variable, and despite much effort it seemed for many years impossible to give a proof entirely free of considerations as sophisticated as this theory. In 1948, however, P. Erdős and A. Selberg gave a completely elementary proof. More precisely, Selberg proved the fundamental relation

$$\sum_{p \le x} \log^{2} p + \sum_{pq \le x} \log p \log q = 2x \log x + O(x),$$

and he and Erdős deduced the Prime Number Theorem from it.†

* See, for example, Volume I, Sections 6-6 and 6-7. 

† Excellent expositions of this proof are given in T. Nagell, Introduction to Number Theory (New York: John Wiley & Sons, 1951) and in G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers (3rd edition. New York: Oxford University Press, 1954).

We present a proof based on the behavior of the $\zeta$-function for complex $s$. Throughout this chapter, familiarity with the contents of a standard course in the theory of functions of a complex variable is presupposed.

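Before going into detail, we outline the proof. Our object is to get an estimate for

$$\pi(x) = \sum_{p \le x} 1 = \sum_{n=1}^{x} P(n),$$

where $P$ is the characteristic function of the primes:

$$P(n) = \begin{cases} 1 & \text{if } n \text{ is prime,} \\ 0 & \text{otherwise.} \end{cases}$$

While $P$ itself does not arise in a natural way, the function $P^{*}$ such that

$$P^{*}(n) = \begin{cases} 1/m & \text{if } n = p^{m} \text{ for some } m, p, \\ 0 & \text{otherwise,} \end{cases}$$

occurs in the Dirichlet series for $\log \zeta(s)$:

$$\log \zeta(s) = \sum_{m,p} \frac{1}{m p^{ms}} = \sum_{n=1}^{\infty} \frac{P^{*}(n)}{n^{s}}. \qquad (1)$$

For fixed $m$, the number of $m$th powers of primes which do not exceed $x$ is equal to the number of primes which do not exceed $x^{1/m}$, so that

$$\sum_{n=1}^{x} P^{*}(n) = \sum_{n=1}^{x} P(n) + \frac12 \sum_{n=1}^{x^{1/2}} P(n) + \frac13 \sum_{n=1}^{x^{1/3}} P(n) + \cdots = \pi(x) + \frac{\pi(x^{1/2})}{2} + \frac{\pi(x^{1/3})}{3} + \cdots,$$

and since, for $m \ge 2$,

$$\pi(x^{1/m}) < c\,x^{1/m} \le c\sqrt{x} = o\!\left(\frac{x}{\log x}\right),$$

it is to be expected that

$$\sum_{n \le x} P^{*}(n) \sim \pi(x).$$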
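The expectation that $\sum_{n \le x} P^{*}(n)$ and $\pi(x)$ are asymptotically equal can be illustrated for moderate $x$. The following sketch sieves the primes and evaluates $\pi(x) + \pi(x^{1/2})/2 + \pi(x^{1/3})/3 + \cdots$; the function names are hypothetical, and the example is an illustration rather than part of the proof.

```python
def prime_counts(limit):
    """Return a list pi with pi[n] = number of primes <= n, 0 <= n <= limit."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            for multiple in range(p * p, limit + 1, p):
                flags[multiple] = False
    counts, running = [], 0
    for is_prime in flags:
        running += is_prime
        counts.append(running)
    return counts


def integer_root(x, m):
    """Largest integer r with r**m <= x, for integers x >= 1 and m >= 1."""
    r = int(round(x ** (1.0 / m)))
    while r ** m > x:           # correct any floating-point overshoot
        r -= 1
    while (r + 1) ** m <= x:    # and any undershoot
        r += 1
    return r


def weighted_prime_power_count(x, counts):
    """Sum of P*(n) for n <= x, i.e. pi(x) + pi(x^(1/2))/2 + pi(x^(1/3))/3 + ..."""
    total = 0.0
    m = 1
    while 2 ** m <= x:          # beyond this, x^(1/m) < 2 and pi vanishes
        total += counts[integer_root(x, m)] / m
        m += 1
    return total


if __name__ == "__main__":
    counts = prime_counts(10 ** 5)
    for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
        print(x, counts[x], weighted_prime_power_count(x, counts))
```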




In light of (1), the present case is a specialization of the following problem: given a function

$$f(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^{s}}$$
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It will be shown that 


X 


E 

n = l 




if 14J < 0, 



so that J{w) is closely related to the characteristic function of the 
positive real numbers. If we put 



this gives 



{x/nY 




[log (x/n) 

io 


if n < T, 
if n > X, 


so that 



^/(s) ds = log - • 

5 n<x 


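If $\delta = \delta(x)$ tends monotonically to zero as $x \to \infty$, then

$$\sum_{n \le x(1+\delta)} a_n \log \frac{x(1+\delta)}{n} - \sum_{n \le x} a_n \log \frac{x}{n} = \log(1+\delta) \sum_{n \le x} a_n + \sum_{x < n \le x + \delta x} a_n \log \frac{x(1+\delta)}{n}$$
$$= \log(1+\delta) \sum_{n \le x} a_n + O\Bigl( \log(1+\delta) \sum_{x < n \le x + \delta x} |a_n| \Bigr).$$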

n<x \ x<n<x+ 5 x 


If the remainder term here is of smaller order of magnitude than the 
first term for suitable choice of 5, then 

n5x ~ 27rf5 


Trih J 2 — 00 f 


/(s) ds, 


and the problem reduces to that of obtaining an adequate estimate 
of this integral. To do this, we replace the line of integration by a 
suitable large closed contour, inside and on which we have sufficient 
information about /(s) to apply standard contour-integral techniques. 

In the case at hand, the estimation of the integral in the last rela- 
tion requires some knowledge of the zeros, poles, and size of f(s). 





7-2 Preliminary results. Following the odd but harmless tradition in analytic number theory, we designate by $\sigma$ and $t$ the real and imaginary parts of the complex variable $s$. For $x > 0$, $x^{s}$ means $e^{s \log x}$, where $\log x$ indicates the real logarithm.

When we have proved the Prime Number Theorem, we shall consider some other rather similar problems, and for one of these it will be necessary to use not the Riemann $\zeta$-function but the so-called Hurwitz $\zeta$-function, defined for $0 < w \le 1$ and $\sigma > 1$ by the equation

$$\zeta(s, w) = \sum_{n=0}^{\infty} \frac{1}{(n + w)^{s}}.$$

Since $\zeta(s, 1) = \zeta(s)$, and since the requisite properties are no more difficult to prove for $\zeta(s, w)$ than for $\zeta(s)$, we consider the more general function.

Theorem 7-1. For any $\sigma_0 > 1$, the series

$$\sum_{n=0}^{\infty} (n + w)^{-s}$$

converges uniformly for $\sigma \ge \sigma_0$, so that $\zeta(s, w)$ is regular (or analytic) for $\sigma > 1$.

Proof: We have

$$|(n + w)^{-s}| = |e^{-s \log(n + w)}| = e^{-\sigma \log(n + w)} = (n + w)^{-\sigma},$$

so that for $\sigma \ge \sigma_0$,

$$\sum_{n=0}^{\infty} |(n + w)^{-s}| \le \sum_{n=0}^{\infty} (n + w)^{-\sigma_0}.$$

Thus we have a series of analytic functions which is dominated throughout the region $\sigma \ge \sigma_0$ by a convergent series of positive constants, and which is therefore uniformly convergent, and the result follows from Weierstrass' theorem.

Theorem 7-2. If $a$ and $b$ are integers with $b > a \ge 0$, and if $f$ has a continuous derivative over $a \le x \le b$, then

$$\sum_{n=a+1}^{b} f(n) = \int_{a}^{b} f(u)\,du + \int_{a}^{b} (u - [u]) f'(u)\,du.$$

Proof: We have

$$\int_{n-1}^{n} u f'(u)\,du = n f(n) - (n-1) f(n-1) - \int_{n-1}^{n} f(u)\,du$$
$$= f(n) + (n-1) \int_{n-1}^{n} f'(u)\,du - \int_{n-1}^{n} f(u)\,du$$
$$= f(n) + \int_{n-1}^{n} [u] f'(u)\,du - \int_{n-1}^{n} f(u)\,du,$$

from which the result follows by summing on $n$ from $a + 1$ to $b$.
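The summation formula of Theorem 7-2 can be verified numerically for a smooth test function. The sketch below integrates over each unit interval with a composite Simpson rule (a choice made here for illustration); on the interval from $n$ to $n + 1$ the factor $u - [u]$ is replaced by $u - n$, which differs from it only at the right endpoint. All names are hypothetical.

```python
def simpson(g, lo, hi, pieces=200):
    """Composite Simpson approximation to the integral of g over [lo, hi]."""
    if pieces % 2:
        pieces += 1                     # Simpson's rule needs an even count
    h = (hi - lo) / pieces
    total = g(lo) + g(hi)
    for i in range(1, pieces):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def euler_summation_rhs(f, fprime, a, b):
    """Integral of f over [a, b] plus integral of (u - [u]) f'(u) over [a, b]."""
    total = 0.0
    for n in range(a, b):
        total += simpson(f, n, n + 1)
        # On [n, n + 1) the integer part [u] equals n.
        total += simpson(lambda u: (u - n) * fprime(u), n, n + 1)
    return total


def euler_summation_lhs(f, a, b):
    """Sum of f(n) for n = a + 1, ..., b."""
    return sum(f(n) for n in range(a + 1, b + 1))


if __name__ == "__main__":
    f = lambda u: 1.0 / u ** 2
    fprime = lambda u: -2.0 / u ** 3
    print(euler_summation_lhs(f, 1, 20), euler_summation_rhs(f, fprime, 1, 20))
```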

Theorem 7-3. If $m$ is a non-negative integer, and $\sigma > 1$, then

$$\zeta(s, w) - \frac{1}{(s-1)(m+w)^{s-1}} = \sum_{n=0}^{m} \frac{1}{(n+w)^{s}} - s \int_{m}^{\infty} \frac{u - [u]}{(u+w)^{s+1}}\,du. \qquad (3)$$

It follows that $\zeta(s, w) - 1/(s - 1)$ is regular for $\sigma > 0$, and that (3) holds for $\sigma > 0$.

Proof: If $\sigma > 1$ and

$$f(u) = \frac{1}{(u + w)^{s}},$$

then the equation of Theorem 7-2 continues to hold if $b \to \infty$, and, replacing $a$ by $m$, we have

$$\sum_{n=m+1}^{\infty} \frac{1}{(n+w)^{s}} = \frac{1}{(s-1)(m+w)^{s-1}} - s \int_{m}^{\infty} \frac{u - [u]}{(u+w)^{s+1}}\,du,$$

from which (3) follows. Since

$$\Bigl| \frac{u - [u]}{(u+w)^{s+1}} \Bigr| < \frac{1}{(u+w)^{\sigma+1}},$$

the integral on the right side of (3) converges absolutely for $\sigma > 0$, and uniformly for $\sigma \ge \sigma_0 > 0$. For arbitrary $n \ge 0$, the quantity

$$\int_{n}^{n+1} \frac{u - n}{(u+w)^{s+1}}\,du$$

is a regular function of $s$ for $\sigma > 0$; the same is true of

$$\sum_{n=m}^{\infty} \int_{n}^{n+1} \frac{u - n}{(u+w)^{s+1}}\,du = \int_{m}^{\infty} \frac{u - [u]}{(u+w)^{s+1}}\,du,$$



for $m \ge 0$. Finally, taking $m = 0$ in (3), we have

$$\zeta(s, w) - \frac{1}{s - 1} = \frac{1}{w^{s}} + \frac{w^{1-s} - 1}{s - 1} - s \int_{0}^{\infty} \frac{u - [u]}{(u+w)^{s+1}}\,du,$$

and the right side is regular for $\sigma > 0$.

Equation (3) thus provides an analytic continuation of $\zeta(s, w)$ over the half-plane $\sigma > 0$. The function is actually analytic over the entire plane, except for the pole at $s = 1$, but this fact is not needed.

Hereafter c will denote a positive constant which depends only on 
the arguments indicated; it need not have the same value in different 
occurrences, unless it has a subscript. 

Theorem 7-4. For $\tfrac12 \le \sigma \le 2$ and $t > c(w)$,

$$|\zeta(s, w)| < t.$$

For $t \ge 8$ and $1 - (\log t)^{-1} \le \sigma \le 2$,

$$|\zeta(s, w)| < c(w) \log t.$$

Proof: For $\tfrac12 \le \sigma \le 2$ and $t \ge 3$ we have $|s| \le 2 + t < 2t$ and $|s - 1| \ge t > 1$. Hence if we take $m = [t] + 1$ in (3), we have

$|\zeta(s, w)| <$


(1^1 + 


1 1 Ul+i 1 f"* du 

T + wy-^ + ^ ^jt 


1 


1 


21 


(Kl + 


+ c(w) + H "7 + "7^ ' 

1 + n = l 


or 


lf(s, WJ)| < 


J . 

([1] + 1 + 


m 

+ c(w) + Z) 

n =1 




Thus, for this same range of a and t, 


^ + 4Vt 


1 ^ ^ 

If n?i ^ 

< 2Vt + C(w) + r^+4Vt <8Vt + c{w), 

Jo wu 


and this is smaller than for ^ > c(w). 

Now take $t \ge 8 > e^{2}$. Then $1 - (\log t)^{-1} > \tfrac12$, so that if $1 - (\log t)^{-1} \le \sigma \le 2$, the inequality (4) gives
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U) * 


lf(s, W))l < + c(w) + E 

n =1 

w 1 


+ 4i 


i/ioe ( 


< 2^6 + c{w) + e E - + 46 < c(w;) log t 

n*=l ^ 


Theorem 7-5. If, for $|x| \le 1$,

$$f(x) = \sum_{n=1}^{\infty} a_n x^{n}$$

is regular and $\operatorname{Re} f(x) \le \tfrac12$, then $|a_n| \le 1$ for $n \ge 1$.

Proof: Since $|f(x)| \le |1 - f(x)|$ for $|x| \le 1$, the function

$$\frac{f(x)}{1 - f(x)} = \frac{a_1 x + \cdots}{1 - a_1 x - \cdots} = a_1 x + b_2 x^{2} + \cdots$$

is regular and has modulus at most 1 for $|x| \le 1$. But the function

$$f_1(x) = \frac{f(x)}{x(1 - f(x))}$$

is also regular for $|x| \le 1$, and its value at $x = 0$ is $a_1$; by the maximum-modulus principle, its absolute value is at least as large at some point on $|x| = 1$. Since for $|x| = 1$,

$$|f_1(x)| = \Bigl| \frac{f(x)}{1 - f(x)} \Bigr| \le 1,$$

it follows that

$$|a_1| \le 1. \qquad (5)$$

The theorem will therefore be proved if we show that each of the functions

$$F_n(x) = a_n x + a_{2n} x^{2} + \cdots$$

fulfills the same hypotheses as $f(x)$ itself. This depends on the fact that if $\eta = e^{2\pi i/n}$, then

$$\sum_{l=0}^{n-1} \eta^{kl} = \begin{cases} n & \text{if } n \mid k, \\ (\eta^{kn} - 1)/(\eta^{k} - 1) = 0 & \text{if } n \nmid k. \end{cases}$$

We have

$$\sum_{l=0}^{n-1} f(\eta^{l} x) = \sum_{l=0}^{n-1} \sum_{k=1}^{\infty} a_k \eta^{kl} x^{k} = \sum_{k=1}^{\infty} a_k x^{k} \sum_{l=0}^{n-1} \eta^{kl} = n \sum_{n \mid k} a_k x^{k} = n F_n(x^{n}),$$

so that $F_n(x)$ is regular for $|x| \le 1$, and for such $x$,

$$\operatorname{Re} F_n(x^{n}) = \frac{1}{n} \sum_{l=0}^{n-1} \operatorname{Re} f(\eta^{l} x) \le \frac{1}{n} \sum_{l=0}^{n-1} \frac12 = \frac12.$$

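Theorem 7-6. Let $R$ be positive, and suppose that

$$f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^{n}$$

is regular and $\operatorname{Re} f(x) \le M$ for $|x - x_0| \le R$. Then, for $n \ge 1$,

$$|a_n| \le \frac{2}{R^{n}} (M - \operatorname{Re} a_0).$$

Proof: If $\operatorname{Re} a_0 = M$, then $a_n = 0$ for $n \ge 1$, by the maximum modulus principle.

If $\operatorname{Re} a_0 < M$, put

$$g(x) = \frac{f(x_0 + Rx) - a_0}{2(M - \operatorname{Re} a_0)}.$$

Then $g$ is regular for $|x| \le 1$, $g(0) = 0$, and

$$\operatorname{Re} g(x) = \frac{\operatorname{Re} f(x_0 + Rx) - \operatorname{Re} a_0}{2(M - \operatorname{Re} a_0)} \le \frac{M - \operatorname{Re} a_0}{2(M - \operatorname{Re} a_0)} = \frac12.$$

Hence $g$ satisfies the hypotheses of Theorem 7-5, so that

$$\Bigl| \frac{a_n R^{n}}{2(M - \operatorname{Re} a_0)} \Bigr| \le 1,$$

and the theorem follows.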

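Theorem 7-7. If $f$ satisfies the hypotheses of Theorem 7-6, and $0 < r < R$, then for $|x - x_0| \le r$,

$$|f(x)| \le |a_0| + \frac{2r}{R - r} (|M| + |a_0|)$$

and

$$|f'(x)| \le \frac{2R}{(R - r)^{2}} (|M| + |a_0|).$$

Proof: We have

$$|f(x)| \le |a_0| + \sum_{n=1}^{\infty} |a_n| r^{n} \le |a_0| + 2(|M| + |a_0|) \sum_{n=1}^{\infty} \Bigl(\frac{r}{R}\Bigr)^{n} = |a_0| + \frac{2r}{R - r} (|M| + |a_0|),$$

and

$$|f'(x)| \le \sum_{n=1}^{\infty} n |a_n| r^{n-1} \le \frac{2(|M| + |a_0|)}{R} \sum_{n=1}^{\infty} n \Bigl(\frac{r}{R}\Bigr)^{n-1} = \frac{2R}{(R - r)^{2}} (|M| + |a_0|).$$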


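Theorem 7-8. Let $r$ be positive and $M$ real, and suppose that $f(s_0) \ne 0$ and that, for $|s - s_0| \le r$, $f(s)$ is regular and

$$\Bigl| \frac{f(s)}{f(s_0)} \Bigr| < e^{M}.$$

Suppose also that $f(s) \ne 0$ in the semicircular region $|s - s_0| \le r$, $\operatorname{Re} s > \operatorname{Re} s_0$. Then

$$-\operatorname{Re} \frac{f'}{f}(s_0) \le \frac{4M}{r},$$

and if there is a zero $\rho$ of $f$ on the open line segment between $s_0 - r/2$ and $s_0$, then

$$-\operatorname{Re} \frac{f'}{f}(s_0) \le \frac{4M}{r} - \frac{1}{s_0 - \rho}.$$

Proof: There is clearly no loss in generality in supposing that $f(s_0) = 1$ and $s_0 = 0$. In this case, the hypotheses can be listed as follows.

(1) For $|s| \le r$, $f(s)$ is regular and $|f(s)| < e^{M}$, where $M > 0$.

(2) $f(0) = 1$.

(3) $f(s) \ne 0$ for $|s| \le r$, $\sigma > 0$.

We look for an upper bound for $-\operatorname{Re} f'(0)$.

If $\rho$ runs through the zeros of $f$ in the circle $|s| \le r/2$, then the function

$$g(s) = \frac{f(s)}{\prod_{\rho} (1 - s/\rho)}$$

is regular for $|s| \le r$. On the circle $|s| = r$, we have

$$\Bigl| 1 - \frac{s}{\rho} \Bigr| \ge \Bigl| \frac{s}{\rho} \Bigr| - 1 \ge 2 - 1 = 1,$$

so that here $|g(s)| \le |f(s)| < e^{M}$. By the maximum-modulus principle,

$$|g(s)| < e^{M} \qquad \text{for } |s| \le r.$$

Since $g(s) \ne 0$ for $|s| \le r/2$, and $g(0) = 1$, we can write

$$g(s) = e^{G(s)} \qquad \text{for } |s| \le \frac{r}{2},$$

where $G$ is regular and $\operatorname{Re} G(s) < M$, $G(0) = 0$. By Theorem 7-6, with $r/2$ instead of $R$,

$$|G'(0)| \le \frac{2}{r/2} M = \frac{4M}{r}.$$

But

$$G'(s) = \frac{g'(s)}{g(s)} = \frac{f'(s)}{f(s)} + \sum_{\rho} \frac{1}{\rho - s},$$

so that

$$\Bigl| f'(0) + \sum_{\rho} \frac{1}{\rho} \Bigr| = \Bigl| \frac{g'}{g}(0) \Bigr| = |G'(0)| \le \frac{4M}{r},$$

$$-\operatorname{Re} f'(0) \le \frac{4M}{r} + \sum_{\rho} \operatorname{Re} \frac{1}{\rho}.$$

Since we have supposed that all zeros $\rho$ have nonpositive real parts, the theorem follows.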

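If $f$ is regular on the vertical line $\sigma_0 + ti$, and if

$$\lim_{\substack{a \to \infty \\ b \to \infty}} \int_{\sigma_0 - ai}^{\sigma_0 + bi} f(s)\,ds = \lim_{\substack{a \to \infty \\ b \to \infty}} i \int_{-a}^{b} f(\sigma_0 + ti)\,dt$$

exists, then we abbreviate this limit to

$$\int_{(\sigma_0)} f(s)\,ds.$$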


Theorem 7-9. We have

$$\frac{1}{2\pi i} \int_{(2)} \frac{y^{s}}{s^{2}}\,ds = \begin{cases} 0 & \text{for } 0 < y < 1, \\ \log y & \text{for } y \ge 1. \end{cases}$$
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Proof: The integral converges, 
because 

O 

y~ , 

4 + ’ 



First suppose that 0 < ij < I* 
Then in the region bounded by 
Cl and C 2 (see Fig. 7-1), the in- 
tegrand is regular, so that by 
Cauchy’s theorem, 

K,ds + 






But along C 2 , which is of length tto, we have 


I 




so that 




7ra 



TT 

a 


Hence, as a 



00 







and the result follows. 

Now suppose that ?/ > 1, and that a > 2. Then the pole s = 0 
of the integrand lies in the region bounded by Ci and C 3 , and since 

y* I + slogy + {s^ log^ y)/2 + - - • 1 , \og y , 


we have by the residue theorem that 


27ri Jci s 




I 


I 


But along C 3 , which is of length 7 ra, we have 


y 


y 


2 < 


y 


(a — 2 ) 
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SO that, for a > 4, 


L (« - 2)^ 


< 


47ry' 


a 


Hence, as a qc 



y. 

C3 


ds 


0 , 



y 

Cl s' 


ds 



y 


(2) S 


dSf 


and the result follows. 

7-3 The Prime Number Theorem. It will be necessary in what follows to know something about the location of the zeros of the $\zeta$-function. For $|t|$ large, this information is supplied by Theorem 7-13; for $|t|$ small, we use only the fact that $\zeta(s)$ does not vanish for $\sigma \ge 1$. Historically, this was the first nontrivial result obtained concerning the zeros of the $\zeta$-function. (A trivial fact is that $\zeta(s) \ne 0$ for $\sigma > 1$, which follows immediately from the product representation

$$\zeta(s) = \prod_{p} (1 - p^{-s})^{-1},$$

valid for $\sigma > 1$.) The proof below that $\zeta(1 + ti) \ne 0$ is due to de la Vallée Poussin; it may have been suggested by the following considerations.

For $\sigma > 1$ we have

$$\log \zeta(s) = \sum_{m,p} \frac{1}{m p^{ms}} = \sum_{p} \frac{1}{p^{s}} + f(s),$$

and $f$ is easily seen to be regular for $\sigma > \tfrac12$. Since $\zeta$ has a pole at $s = 1$, with residue 1, it follows that as $\sigma \to 1^{+}$,

$$\sum_{p} \frac{1}{p^{\sigma}} \sim \log \frac{1}{\sigma - 1}. \qquad (6)$$

We now reason heuristically. If $1 + t_0 i$ is a zero of $\zeta$, and we put $s = \sigma + t_0 i$, then as $\sigma \to 1^{+}$,

$$\log |\zeta(s)| \sim \log(\sigma - 1)$$

and

$$\operatorname{Re} \log \zeta(s) - \operatorname{Re} f(s) = \log |\zeta(s)| - \operatorname{Re} f(s) = \sum_{p} \frac{\cos(t_0 \log p)}{p^{\sigma}} \sim \log(\sigma - 1).$$

Comparing this with (6) we see that for most $p$, $\cos(t_0 \log p)$ must be close to $-1$. But then $\cos(2t_0 \log p)$ must usually be nearly 1, and

$$\sum_{p} \frac{\cos(2t_0 \log p)}{p^{\sigma}} \sim \log \frac{1}{\sigma - 1}.$$

But this requires that $\zeta$ have a pole at $1 + 2t_0 i$, which is not the case.

To make this argument rigorous, note that for all real $\theta$,

$$3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^{2} \ge 0.$$

Hence for $\sigma > 1$,

$$\log |\zeta^{3}(\sigma)\,\zeta^{4}(\sigma + t_0 i)\,\zeta(\sigma + 2t_0 i)| = 3 \log |\zeta(\sigma)| + 4 \log |\zeta(\sigma + t_0 i)| + \log |\zeta(\sigma + 2t_0 i)|$$
$$= \sum_{n,p} \frac{3 + 4\cos(t_0 n \log p) + \cos(2t_0 n \log p)}{n p^{n\sigma}} \ge 0.$$

Thus

$$\bigl((\sigma - 1)\zeta(\sigma)\bigr)^{3} \Bigl| \frac{\zeta(\sigma + t_0 i)}{\sigma - 1} \Bigr|^{4} |\zeta(\sigma + 2t_0 i)| \ge \frac{1}{\sigma - 1},$$

and if $1 + t_0 i$ were a zero of $\zeta$, the left side in this inequality would remain bounded as $\sigma \to 1^{+}$, while the right side increases without limit.

We now use this technique, together with Theorem 7-8, to show that $\zeta(s)$ does not vanish at any point too close to the line $\sigma = 1$ and sufficiently far from the real axis.

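Theorem 7-10. For $\sigma > 1$,

$$\operatorname{Re} \Bigl( -3\,\frac{\zeta'}{\zeta}(\sigma) - 4\,\frac{\zeta'}{\zeta}(\sigma + ti) - \frac{\zeta'}{\zeta}(\sigma + 2ti) \Bigr) \ge 0.$$

Proof: Differentiating the relation

$$\log \zeta(s) = \sum_{m,p} \frac{1}{m p^{ms}},$$

we obtain

$$\frac{\zeta'}{\zeta}(s) = -\sum_{m,p} \frac{\log p}{p^{ms}} = -\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^{s}}, \qquad (7)$$

where

$$\Lambda(n) = \begin{cases} \log p & \text{if } n = p^{m}, \text{ for any } m > 0 \text{ and prime } p, \\ 0 & \text{otherwise.} \end{cases}$$

The termwise differentiation is justified because the series for $\log \zeta(s)$ converges uniformly in any region to the right of $\sigma = 1$. Hence

$$\operatorname{Re} \Bigl( -3\,\frac{\zeta'}{\zeta}(\sigma) - 4\,\frac{\zeta'}{\zeta}(\sigma + ti) - \frac{\zeta'}{\zeta}(\sigma + 2ti) \Bigr) = \operatorname{Re} \sum_{n=1}^{\infty} \frac{(3 + 4n^{-ti} + n^{-2ti})\Lambda(n)}{n^{\sigma}}$$
$$= \sum_{n=1}^{\infty} \frac{(3 + 4\cos(t \log n) + \cos(2t \log n))\Lambda(n)}{n^{\sigma}} \ge 0.$$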
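The trigonometric inequality behind Theorem 7-10, and the product inequality for $\zeta$ that precedes it, hold factor by factor in the Euler product, so they can be checked on a finite product over primes. The sketch below does this for a few values of $\sigma$ and $t$; it is an illustration under the stated truncation, and the function names are hypothetical.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            for multiple in range(p * p, limit + 1, p):
                flags[multiple] = False
    return [n for n, is_prime in enumerate(flags) if is_prime]


def trig_combination(theta):
    """3 + 4 cos(theta) + cos(2 theta) = 2 (1 + cos(theta))^2."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)


def partial_euler_product(s, primes):
    """Finite product of 1 / (1 - p^(-s)) over the given primes."""
    product = complex(1.0, 0.0)
    for p in primes:
        product /= 1 - p ** (-s)
    return product


def three_four_one(sigma, t, primes):
    """|Z(sigma)|^3 |Z(sigma + ti)|^4 |Z(sigma + 2ti)| for the finite product Z."""
    a = abs(partial_euler_product(complex(sigma, 0.0), primes))
    b = abs(partial_euler_product(complex(sigma, t), primes))
    c = abs(partial_euler_product(complex(sigma, 2 * t), primes))
    return a ** 3 * b ** 4 * c


if __name__ == "__main__":
    primes = primes_up_to(2000)
    for t in (0.5, 14.134725, 21.022):
        print(t, three_four_one(1.01, t, primes))   # each value should be >= 1
```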


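Theorem 7-11. (a) For $\sigma \ge \tfrac12$ and $t > c$, we have $|\zeta(s)| < t$.

(b) For $t \ge 8$ and $\sigma \ge 1 - (\log t)^{-1}$, we have $|\zeta(s)| < c \log t$.

Proof: For $\sigma \ge 2$ and $t \ge 8$,

$$|\zeta(s)| \le \sum_{n=1}^{\infty} \frac{1}{n^{2}} < 2 < \begin{cases} t, \\ \log t. \end{cases}$$

For $\sigma < 2$, both inequalities of the theorem follow from Theorem 7-4.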

Theorem 7-12. For $\sigma > 1$,

$$\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^{s}},$$

where $\mu$ is the Möbius function.

Proof: This follows immediately from Theorem 6-12 for $s$ real and greater than 1; by analytic continuation, it is correct for $\sigma > 1$.

Theorem 7-13. There are constants $c_1 > 8$ and $c_2 > 0$ such that $\zeta(s) \ne 0$ for

$$t > c_1 \qquad \text{and} \qquad \sigma > 1 - \frac{c_2}{\log t}.$$

Proof: In accordance with Theorem 7-11 (a), choose $c_3$ so that

$$|\zeta(s)| < t, \qquad \text{for } \sigma \ge \tfrac12,\ t \ge c_3.$$
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Inasmuch as 


3 ^ ^4 

4 log X 


< 1 - 


C2 


log X 


forx > 


it suffices to show that any zero 0 + yi of f with y sufficiently large 
(in particular, larger than 8) and for which 


3 , C4 

: 

4 log 7 


is such that 


d < 1 - 


Co 


log 7 


Put 




= 0-0(7) = 1 + 


£4 

log 7 


I 


and suppose that /3 + yi is a zero of f for which y > e 24 and 
0 > (70 - t- We shall apply Theorem 7-8, once with so = + yi 

and once with sq = (Tq + 2 yi. In either case, since cro > 1 , we have 
that for 7 > C3 + i the circle js - so| < I lies in the quadrant 
^ > C3. Since 7 > e'h we have ao < 2 , and, by Theorem 7 - 12 , 


1 


00 


*^0 


< 1 + 



00 


du 1 . 

^ — 1 

U 0 (TO 




2 2 
— = - log 7. 


C4 


Thn^ for each €i 


0 there is a C5 such that for 7 > ^5 > C3 + 2, the 




ns) 

r(so) 


Ur + i) 
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holds at every point s of the circular disk \s — so| ^ 2 j since at every 
such point, C 3 < i < 27 + ^. If 7 > C 5 , we can now apply Theorem 
7-8, with r = \, f(s) = f(s), M = (I + ei) log 7 . Using the first 
inequality of that theorem with So — <to + 2yij we obtain 


f' 

— Re y (ao + 2yi) < 8(1 + € 1 ) log 7 ; 
using the second with so = o-q + 7 ^* we have 

r' 1 

— Re — (< 7-0 + 7^) < 8(1 + Cl) log 7 

f a-o — p 


(9) 


( 10 ) 


since 




- r/2 


<^o 


^ 0 < I < <To. 


Finally, since cro — > 1 "^ as i , we have from ( 6 ) that for £2 > 0, 


f \ ^ ^1+^2 1 + C2 , 

- - M < 7 = — — log 7 

f O-Q — 1 C4 


( 11 ) 


for 7 > C 6 . Using the estimates (9), (10), and (11) in Theorem 
7-10 gives 

3(L±i2)j^ ^ + 4.8(1 +*,)log^ ^ + 8(1 + ti) log7 > 0 . 


Ca 


(To — 


This inequality can easily be simplified to 


(TO 0 > 


07 


log 7 


where 


4C4 


07 = 


3(1 + C2) + 40(1 + €i)C4 


and this gives 


^ < 1 - 


07 — ^4 

log 7 


It is clear that cy 
then take C 2 = ^ 


C4 if €i <3 and C4 is sufficiently small, and we can 
C4 and Cl = max (cs, cq). 


Theorem 7-14. If $0 < c_8 < c_2$, then

$$|\log \zeta(s)| < \log^{2} t \qquad \text{for } t > c_9 \text{ and } \sigma \ge 1 - \frac{c_8}{\log t}.$$
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Proof : We use Theorem 7-7, with sq = 2 + <oh for some <o > 8 to 
be determined. For t sufficiently large, the circular region 


s — Sol ^ 1 + 


i(c2 + Cs) 
log to 


(12) 


lies entirely in the region described in the preceding theorem, in which 
r has no zeros. Hence the function log f (s) is regular in this disk, and 

by Theorem 7-11 (b), 

Re log f (s) = log If (s)l < log (c log t) 

< log (c log {to + 2)) < Cio log log to. 

Hence, by Theorem 7-7, we have that for s in the region (12), 

, 2 • 2(cio log log <0 + I f (so)|) 

llogf(s)l < lf(so)l + 1 

2 log <0 

< c + c log <0 log log to < log^ <0, 

if <0 is sufficiently large. This inequality holds on the radius extending 
toward the left from Sq, for every large to, and hence throughout a 
region < > cg, 1 - cg (log < 2. Finally, [f (s)| and ll/f(s)| 

are bounded in the half-plane cr > 2, and llogf(s)| is consequently 
smaller than log^ t for t large and o- > 2. 


Theorem 7-15. There is a constant a > 0 such that as x 

Zlog- = r ds + 0{xe-‘-^n, 

p<x P Jc S 

for some c with 0 < c < 1. 

Proof: Using Theorem 7-9, we have 

/ -5 log f (s) ds = — . I -o 12 ] ( - ) ds 

L) s^ ^ 2in J{ 2 ) n^i log ^ 


00 


27rt 


= J_ f A(n) 
27rt n =I log 



(x/ny , ^ A(n) X 1 , x 

= i: r^logT = Z 


( 2 ) 


n<xlog^ n 


nt,p m 

jr'*<z 


V 


= E log-d- E -log 


pS X 


V 


m,p m 

B»<X 


V 


m 
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Figure 7-3 


As noted earlier, the number of terms in the last sum is 

ir{x^) + 7r(a:^) H < -\ h X^>'‘ < UX*, 

where u is the smallest integer such that x^'" < 2. Thus 

Tr(x^) + ir(a:*) + ■ • • = 0(x^ log x), 

SO that 

2W S log - = f ^ log f (s) ds + 0(xe-'^), 

p<x P ® 

since 

V — log— < S log X = O {Vx \o^ x) = 0 {xe~''^) ■ 

m>2 yn p m>2 

p”*<x p"*<x 

We now cut the complex plane along the real axis, extending the cut from $s = 1$ to the left, and examine the function $\log \zeta(s)$ in the cut plane. If $\bar z$ is the complex conjugate of $z$, then $\zeta(\bar s) = \overline{\zeta(s)}$ and $\log \bar z = \overline{\log z}$, so that $\log \zeta(\bar s) = \overline{\log \zeta(s)}$. Hence, by Theorem 7-13, $\zeta(s) \ne 0$ for $|t| > c_9 > c_1$ and $\sigma > 1 - c_8 (\log |t|)^{-1}$. Moreover, since $\zeta(s)$ does not vanish on the line $\sigma = 1$, and since its zeros have no finite limit point in any half-plane $\sigma \ge \sigma_0 > 0$ (since $\zeta(s)$ is regular there), there is a constant $c_{11} > 0$ such that $\zeta(s) \ne 0$ in the rectangle

\[ 1 - c_{11} \le \sigma \le 1, \qquad |t| \le c_9. \]

Finally, $\zeta(s) \ne 0$ for $1 < \sigma \le 2$, and the only singularity of the function in the half-plane $\sigma > 0$ is at $s = 1$. Consequently, for arbitrary $u > c_9$, $\log \zeta(s)$ is a single-valued analytic function in the region $Q$ shown in Fig. 7-3, bounded by the arcs $\Gamma_1, \Gamma_2, \ldots, \Gamma_6, \Gamma_7, \bar\Gamma_6, \ldots, \bar\Gamma_2, \bar\Gamma_1$. Hence if we denote by $\Gamma$ the complete boundary of this region (so that we might write symbolically $\Gamma = \Gamma_1 + \Gamma_2 + \cdots + \bar\Gamma_1$), we have by Cauchy's theorem that

\[ \int_{\Gamma} \frac{x^s}{s^2} \log \zeta(s)\,ds = 0. \]


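It follows that, if the integrals are taken in the positive direction,

\[
\int_{(2)} \frac{x^s}{s^2}\log\zeta(s)\,ds
= \left(\int_{2-\infty i}^{2-ui} + \int_{2+ui}^{2+\infty i}\right)\frac{x^s}{s^2}\log\zeta(s)\,ds
+ \left(\int_{\bar\Gamma_2} + \cdots + \int_{\bar\Gamma_6} + \int_{\Gamma_7} + \int_{\Gamma_6} + \cdots + \int_{\Gamma_2}\right)\frac{x^s}{s^2}\log\zeta(s)\,ds.
\]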


We shall show that all the other integrals are small in comparison with those along $\Gamma_6$ and $\bar\Gamma_6$, if $u$ is sufficiently large. For brevity, put

\[ \psi(x, s) = \frac{x^s}{s^2} \log \zeta(s). \]

By Theorem 7-14, we have that for $u$ sufficiently large,

\[
\left|\int_{2+ui}^{2+\infty i}\psi(x,s)\,ds\right|
< \int_{2+ui}^{2+\infty i}\frac{x^2}{|s|^2}\,|\log\zeta(s)|\,|ds|
< x^2\int_u^\infty \frac{\log^2 t}{t^2}\,dt
< x^2\int_u^\infty \frac{dt}{t^{3/2}} = \frac{c x^2}{\sqrt{u}},
\]

so that

\[ \lim_{u\to\infty}\int_{2+ui}^{2+\infty i}\psi(x,s)\,ds = 0. \]

The same estimate applies to the integral from $2 - \infty i$ to $2 - ui$.

Since the length of $\Gamma_2$ is less than 2, and the integrand is again smaller than $x^2 \log^2 u/u^2$ for $u$ large, we have

\[ \lim_{u\to\infty}\int_{\Gamma_2}\psi(x,s)\,ds = 0, \]

and similar considerations give

\[ \lim_{u\to\infty}\int_{\bar\Gamma_2}\psi(x,s)\,ds = 0. \]

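Along $\Gamma_3$ we have $s = 1 - c_8 (\log t)^{-1} + ti$, so that

\[
\left|\int_{\Gamma_3}\psi(x,s)\,ds\right|
< x\int_{c_9}^{u} x^{-c_8/\log t}\,\frac{\log^2 t}{t^2}\left(1 + \frac{c_8}{t\log^2 t}\right)dt.
\]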


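Now suppose that $x$, and then $u$, are chosen large enough that

\[ c_9 < e^{\sqrt{2c_8\log x}} < u. \]

Then

\[
\begin{aligned}
\int_{\Gamma_3}\psi(x,s)\,ds
&= O\!\left(x\int_{c_9}^{e^{\sqrt{2c_8\log x}}} x^{-c_8/\log t}\,\frac{\log^2 t}{t^2}\,dt\right)
 + O\!\left(x\int_{e^{\sqrt{2c_8\log x}}}^{u}\frac{\log^2 t}{t^2}\,dt\right) \\
&= O\!\left(x e^{-c_8\log x/\sqrt{2c_8\log x}}\int_{c_9}^{\infty}\frac{\log^2 t}{t^2}\,dt\right)
 + O\!\left(x\int_{e^{\sqrt{2c_8\log x}}}^{\infty}\frac{dt}{t^{3/2}}\right) \\
&= O\!\left(x e^{-\alpha\sqrt{\log x}}\right),
\end{aligned}
\]

where $\alpha = \sqrt{c_8/2}$.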
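By symmetry,

\[ \int_{\bar\Gamma_3}\psi(x,s)\,ds = O\!\left(xe^{-\alpha\sqrt{\log x}}\right). \]

The paths $\Gamma_4$, $\Gamma_5$, and $\bar\Gamma_5$ are of fixed lengths, and on them

\[ \psi(x, s) = O(x^{1-c_{11}}) = O\!\left(xe^{-\alpha\sqrt{\log x}}\right), \]

so that the same estimate holds for the integrals themselves.

$\Gamma_7$ is described by the relations $s = 1 + \delta e^{\theta i}$, $|\theta| < \pi$, where $\delta > 0$. Since $(s - 1)\zeta(s) \to 1$ as $s \to 1$, we have

\[ \operatorname{Re}\log\zeta(s) = \log|\zeta(s)| \sim -\log|s - 1| = -\log\delta, \qquad \operatorname{Im}\log\zeta(s) = \arg\zeta(s) = O(1) \]

as $\delta \to 0^+$. Hence

\[ \int_{\Gamma_7}\psi(x,s)\,ds = O\!\left(2\pi\delta\cdot\frac{x^{1+\delta}}{(1-\delta)^2}\,|\log\delta|\right) = o(1). \]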

Combining all these results, we can take the limit as $u \to \infty$ and $\delta \to 0^+$ and obtain

\[ 2\pi i\sum_{p\le x}\log\frac{x}{p} = \int_{1}^{1-c_{11}}\psi(x,s)\,ds + \int_{1-c_{11}}^{1}\psi(x,s)\,ds + O\!\left(xe^{-\alpha\sqrt{\log x}}\right), \]

where the first integral is along the upper edge, and the second along the lower edge, of the cut. We know that $(s - 1)\zeta(s)$ is regular in the region $\sigma > 0$, and that it has no zeros in the region $\sigma > 1 - c_{11}$, $|t| < c_9$. Hence the function

\[ \log\big((s - 1)\zeta(s)\big) = \log(s - 1) + \log\zeta(s) \]

is single-valued in this region; since $\log(s - 1)$ has, on the upper and lower edges of the cut, values which differ by $2\pi i$, the same is true of $\log\zeta(s)$, if the difference is taken in reverse order. Hence if $s^+$ indicates the upper edge of the cut, and $s^-$ the lower edge, then

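\[
\begin{aligned}
\int_{1}^{1-c_{11}}\psi(x,s^+)\,ds^+ + \int_{1-c_{11}}^{1}\psi(x,s^-)\,ds^-
&= \int_{1-c_{11}}^{1}\frac{x^{s}}{s^2}\log\zeta(s^-)\,ds - \int_{1-c_{11}}^{1}\frac{x^{s}}{s^2}\big(\log\zeta(s^-) - 2\pi i\big)\,ds \\
&= 2\pi i\int_{1-c_{11}}^{1}\frac{x^s}{s^2}\,ds,
\end{aligned}
\]

and

\[ \sum_{p\le x}\log\frac{x}{p} = \int_{1-c_{11}}^{1}\frac{x^s}{s^2}\,ds + O\!\left(xe^{-\alpha\sqrt{\log x}}\right). \tag{13} \]

The theorem is proved.

Theorem 7-16. As $x \to \infty$,

\[ \pi(x) = \int_2^x\frac{du}{\log u} + O\!\left(xe^{-a\sqrt{\log x}}\right). \]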




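Proof: Replace $1 - c_{11}$ by $c$ in (13), and put

\[ \delta = \delta(x) = e^{-\frac12\alpha\sqrt{\log x}}. \]

Then since $\log(1 + \delta) \sim \delta$ as $x \to \infty$, we have

\[
\begin{aligned}
\sum_{p\le x(1+\delta)}\log\frac{x(1+\delta)}{p} - \sum_{p\le x}\log\frac{x}{p}
&= \sum_{p\le x}\log(1+\delta) + \sum_{x<p\le x(1+\delta)}\log\frac{x(1+\delta)}{p} \\
&= \log(1+\delta)\,\pi(x) + O\big(\log(1+\delta)\cdot\delta x\big) \\
&= \int_c^1\frac{x^s}{s^2}\big((1+\delta)^s - 1\big)\,ds + O\!\left(xe^{-\alpha\sqrt{\log x}}\right),
\end{aligned}
\]

so that

\[ \pi(x) = \frac{1}{\log(1+\delta)}\int_c^1\frac{x^s}{s^2}\big((1+\delta)^s - 1\big)\,ds + O(\delta x) + O\!\left(\frac{xe^{-\alpha\sqrt{\log x}}}{\delta}\right). \]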


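Now

\[ (1+\delta)^s - 1 = s\delta + \frac{s(s-1)}{2!}(1+\theta\delta)^{s-2}\delta^2, \]

where $0 < \theta < 1$, so that for $0 < s \le 1$,

\[ |(1+\delta)^s - 1 - s\delta| \le \frac{s|s-1|}{2}\,\delta^2 < \delta^2. \]

Thus, making the change of variable $x^s = u$, we obtain

\[
\begin{aligned}
\int_c^1\frac{x^s}{s^2}\big((1+\delta)^s - 1\big)\,ds
&= \delta\int_c^1\frac{x^s}{s}\,ds + O\!\left(\delta^2\int_c^1\frac{x^s}{s^2}\,ds\right) \\
&= \delta\int_{x^c}^{x}\frac{du}{\log u} + O\!\left(\delta^2\int_c^1 x^s\,ds\right) \\
&= \delta\int_2^x\frac{du}{\log u} + O(\delta^2 x).
\end{aligned}
\]

Finally,

\[
\begin{aligned}
\pi(x) &= \frac{\delta}{\log(1+\delta)}\int_2^x\frac{du}{\log u} + O(\delta x) + O\!\left(\frac{xe^{-\alpha\sqrt{\log x}}}{\delta}\right) \\
&= \big(1 + O(\delta)\big)\int_2^x\frac{du}{\log u} + O(\delta x) + O\!\left(xe^{-\frac12\alpha\sqrt{\log x}}\right) \\
&= \int_2^x\frac{du}{\log u} + O(\delta x) = \int_2^x\frac{du}{\log u} + O\!\left(xe^{-\frac12\alpha\sqrt{\log x}}\right).
\end{aligned}
\]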




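The Prime Number Theorem is a very weak consequence of Theorem 7-16, since

\[ \int_2^x\frac{du}{\log u} = \left[\frac{u}{\log u}\right]_2^x + \int_2^x\frac{du}{\log^2 u} \sim \frac{x}{\log x} \]

and

\[ xe^{-c\sqrt{\log x}} = o\!\left(\frac{x}{\log x}\right) \]

for every $c > 0$. In fact, we see by repeated integration by parts that the relation

\[ \pi(x) = \frac{x}{\log x} + \frac{x}{\log^2 x} + \frac{2!\,x}{\log^3 x} + \frac{3!\,x}{\log^4 x} + \cdots + \frac{m!\,x}{\log^{m+1} x} + o\!\left(\frac{x}{\log^{m+1} x}\right) \]

holds for every positive integer $m$.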
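These approximations can be compared numerically. The sketch below is illustrative only: the function names `prime_count`, `log_integral` and `series` are assumptions, and the integral is approximated by a fairly coarse Simpson rule, which is adequate here because the differences being compared are much larger than the quadrature error.

```python
import math


def prime_count(x):
    """pi(x), computed with a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(sieve)


def log_integral(x, steps=200_000):
    """Simpson approximation to the integral of du/log u over [2, x]."""
    h = (x - 2) / steps  # steps must be even
    total = 1 / math.log(2) + 1 / math.log(x)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) / math.log(2 + j * h)
    return total * h / 3


def series(x, m):
    """x/log x + 1! x/log^2 x + ... + m! x/log^(m+1) x."""
    log_x = math.log(x)
    return sum(math.factorial(j) * x / log_x ** (j + 1) for j in range(m + 1))


x = 10**6
print(prime_count(x), log_integral(x), series(x, 0), series(x, 2))
```

At this range the integral is far closer to $\pi(x)$ than $x/\log x$ is, and adding further terms of the series narrows the gap, in line with the expansion above.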

The coefficient a occurring in the remainder term in Theorem 7-16 
can easily be bounded explicitly; it can be shown for example that 
a = is an allowable value, by choosing C 2 = toVo) ^4 = 

^8 = == iV) «2 = T^* However, no result of this type is as 

good as the known result that

\[ \pi(x) = \int_2^x\frac{du}{\log u} + O\!\left(xe^{-c\sqrt{\log x\,\log\log x}}\right). \]


In a variant of the proof given here, the factor $\log\zeta(s)$ in the integrand is replaced by $\zeta'(s)/\zeta(s)$. The logarithmic singularity at $s = 1$ is then replaced by a simple pole, which makes the analysis somewhat less complicated. On the other hand this gives an estimate not of

\[ \pi(x) = \sum_{p\le x} 1, \]

but of

\[ \psi(x) = \sum_{p\le x}\log p\left[\frac{\log x}{\log p}\right], \]

and an additional step is needed to obtain the final result.


7-4 Extension to primes in an arithmetic progression. For relatively prime integers $k$ and $l$, let $\pi(x; k, l)$ be the number of primes $p \equiv l \pmod k$ which do not exceed $x$. For given $k$, there are $\varphi(k) = h$ choices of $l$ which are distinct modulo $k$, so that if the primes are more or less evenly dispersed among the various progressions, it is to be expected that

\[ \pi(x; k, l) \sim \frac{1}{h}\,\frac{x}{\log x}. \]

It is the object of this section to show that this is the case, and in fact to obtain an estimate for $\pi(x; k, l)$ similar to that given in Theorem 7-16 for $\pi(x) = \pi(x; 1, 0)$. Several proofs will not be given in full detail since they are similar to those of the preceding sections. As in the preceding chapter, we isolate the primes lying in a given arithmetic progression by use of characters and $L$-functions. The $L$-functions are in turn simple combinations of Hurwitz $\zeta$-functions, as the following theorem shows.

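Theorem 7-17. For $\sigma > 1$,

\[ L(s,\chi) = k^{-s}\sum_{a=1}^{k}\chi(a)\,\zeta\!\left(s,\frac{a}{k}\right). \tag{14} \]

Proof: Since $\chi$ is periodic of period $k$,

\[ L(s,\chi) = \sum_{n=1}^{\infty}\frac{\chi(n)}{n^s} = \sum_{a=1}^{k}\chi(a)\sum_{m=0}^{\infty}\frac{1}{(km+a)^s} = k^{-s}\sum_{a=1}^{k}\chi(a)\,\zeta\!\left(s,\frac{a}{k}\right). \]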
The domain of validity of (14) can be extended somewhat. If we put

\[ E(\chi) = \begin{cases} 1 & \text{if } \chi = \chi_0, \\ 0 & \text{if } \chi \ne \chi_0, \end{cases} \]

then the first equation of Theorem 6-6 becomes

\[ \sum_{a=1}^{k}\chi(a) = E(\chi)h. \]

Hence, for $\sigma > 1$,

\[ L(s,\chi) - \frac{E(\chi)h}{k(s-1)} = \frac{E(\chi)h}{s-1}\left(\frac{1}{k^s} - \frac{1}{k}\right) + \frac{1}{k^s}\sum_{a=1}^{k}\chi(a)\left[\zeta\!\left(s,\frac{a}{k}\right) - \frac{1}{s-1}\right]. \]

By Theorem 7-3, each summand on the right is regular for $\sigma > 0$, and

\[ \frac{1}{s-1}\left(\frac{1}{k^s} - \frac{1}{k}\right) = \frac{k^{1-s} - 1}{k(s-1)} \]

is an integral function. By analytic continuation, we have

Theorem 7-18. The relation (14) holds for $\sigma > 0$, except at $s = 1$. Moreover,

\[ \lim_{s\to1}(s-1)L(s,\chi) = \frac{hE(\chi)}{k}. \tag{15} \]

Hence $L(s,\chi)$ is regular for $\sigma > 0$, except that $L(s,\chi_0)$ has a simple pole at $s = 1$.


For <r > 2 and t > 8, 


iLfs, x)l < H “7 


^ c\ ^ 


t 


while for <r > 0 and i > 0, 

lL(s, x)l < 

so that Theorem 7-4 yields 

Theorem 7-19. (a) For $\sigma \ge \tfrac12$ and $t > c_{12}(k)$, we have $|L(s,\chi)| < t$.
(b) For $t > 8$ and $\sigma > 1 - (\log t)^{-1}$, we have $|L(s,\chi)| < c_{13}(k)\log t$.

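The proof of the nonvanishing of $\zeta(s)$ on $\sigma = 1$ can be generalized in a simple way.

Theorem 7-20. $L(s,\chi)$ does not vanish on the line $\sigma = 1$.

Proof: For $\sigma > 1$,

\[ L(s,\chi) = \prod_p\big(1 - \chi(p)p^{-s}\big)^{-1}, \]

so that we can choose

\[ \log L(s,\chi) = \sum_{m,\,p}\frac{\chi(p^m)}{mp^{ms}}. \]

Hence

\[
\begin{aligned}
\log\big|L^3(\sigma,\chi_0)\,L^4(\sigma+ti,\chi)\,L(\sigma+2ti,\chi^2)\big|
&= 3\log|L(\sigma,\chi_0)| + 4\log|L(\sigma+ti,\chi)| + \log|L(\sigma+2ti,\chi^2)| \\
&= 3\log L(\sigma,\chi_0) + 4\operatorname{Re}\log L(\sigma+ti,\chi) + \operatorname{Re}\log L(\sigma+2ti,\chi^2) \\
&= \sum_{\substack{m,\,p\\ p\nmid k}}\left(\frac{3}{mp^{m\sigma}} + \operatorname{Re}\left(\frac{4\chi(p^m)}{mp^{m(\sigma+ti)}} + \frac{\chi^2(p^m)}{mp^{m(\sigma+2ti)}}\right)\right) \\
&= \sum_{\substack{m,\,p\\ p\nmid k}}\frac{3 + 4\cos\big(\eta(p^m) - t\log p^m\big) + \cos 2\big(\eta(p^m) - t\log p^m\big)}{mp^{m\sigma}} \ge 0,
\end{aligned}
\]

where $\chi(p^m) = e^{i\eta(p^m)}$. Thus

\[ \big((\sigma-1)L(\sigma,\chi_0)\big)^3\left|\frac{L(\sigma+ti,\chi)}{\sigma-1}\right|^4\big|L(\sigma+2ti,\chi^2)\big| \ge \frac{1}{\sigma-1}, \]

and the falsity of the theorem would contradict Theorem 7-18.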
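The nonnegativity used in this proof rests on the elementary identity $3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^2$. A short numerical check in Python is sketched below; the name `trig_kernel` and the sampling grid are illustrative choices only.

```python
import math


def trig_kernel(theta):
    """The combination 3 + 4 cos(theta) + cos(2 theta) from the proof."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)


# Sample one full period on a uniform grid.
samples = [2 * math.pi * j / 1000 for j in range(1000)]
min_value = min(trig_kernel(t) for t in samples)
max_gap = max(abs(trig_kernel(t) - 2 * (1 + math.cos(t)) ** 2) for t in samples)
print(min_value, max_gap)
```

The minimum is attained at $\theta = \pi$, where the expression vanishes, so the inequality cannot be improved by changing the weights 3, 4, 1 only slightly.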

By now the proof of the following analog of Theorem 7-10 should be a simple exercise for the reader.

Theorem 7-21. For $\sigma > 1$,

\[ -3\,\frac{L'}{L}(\sigma,\chi_0) - 4\operatorname{Re}\frac{L'}{L}(\sigma+ti,\chi) - \operatorname{Re}\frac{L'}{L}(\sigma+2ti,\chi^2) \ge 0. \]

Theorem 7-13 becomes

Theorem 7-22. There is a $c_1(k) > 8$ such that $L(s,\chi) \ne 0$ for $t > c_1(k)$ and $\sigma > 1 - c_2/\log t$.

The only difference in the proofs is that now the first inequality of Theorem 7-8 is applied with $f(s) = L(s,\chi^2)$ and $s_0 = \sigma + 2ti$, while the second is applied with $f(s) = L(s,\chi)$ and $s_0 = \sigma + ti$. All constants now depend on $k$. After these trivial modifications, the proofs are identical.

Similarly, replacing $\zeta(s)$ by $L(s,\chi)$ throughout, Theorem 7-14 becomes

Theorem 7-23. For $t > c_9(k) > 8$ and $\sigma > 1 - c_8(\log t)^{-1}$, $|\log L(s,\chi)| < \log^2 t$.

The constant $c_9(k)$ may be different from the $c_9$ of Theorem 7-14; the subscript is retained to facilitate reference to Fig. 7-3. In the same way, $c_{11}$ becomes $c_{11}(k)$.

Instead of proceeding directly to the analog of Theorem 7-15, it is 
convenient to break the argument into two steps. 


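Theorem 7-24. For $(k, l) = 1$, we have

\[ \sum_{\substack{p\le x\\ p\equiv l \ (\mathrm{mod}\ k)}}\log\frac{x}{p} = \frac{1}{2\pi i h}\sum_{\chi}\frac{1}{\chi(l)}\int_{(2)}\frac{x^s}{s^2}\log L(s,\chi)\,ds + O(\sqrt{x}\,\log^2 x). \]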


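Proof: Using Theorem 7-9 and the series expansion for $\log L(s,\chi)$, we obtain

\[
\begin{aligned}
\frac{1}{2\pi i}\int_{(2)}\frac{x^s}{s^2}\log L(s,\chi)\,ds
&= \frac{1}{2\pi i}\int_{(2)}\frac{x^s}{s^2}\sum_{m,\,p}\frac{\chi(p^m)}{mp^{ms}}\,ds
 = \sum_{\substack{m,\,p\\ p^m\le x}}\frac{\chi(p^m)\log(x/p^m)}{m} \\
&= \sum_{p\le x}\chi(p)\log\frac{x}{p} + \sum_{\substack{m\ge2\\ p^m\le x}}\frac{\chi(p^m)\log(x/p^m)}{m} \\
&= \sum_{\substack{p\le x\\ p\nmid k}}\chi(p)\log\frac{x}{p} + O(\sqrt{x}\,\log^2 x).
\end{aligned}
\]

Multiplying by $1/\chi(l)$ and summing over all characters modulo $k$, we deduce with the help of Theorem 6-7 that

\[
\sum_{\chi}\frac{1}{\chi(l)}\sum_{p\le x}\chi(p)\log\frac{x}{p}
= h\sum_{\substack{p\le x\\ p\equiv l \ (\mathrm{mod}\ k)}}\log\frac{x}{p}
= \frac{1}{2\pi i}\sum_{\chi}\frac{1}{\chi(l)}\int_{(2)}\frac{x^s}{s^2}\log L(s,\chi)\,ds + O(\sqrt{x}\,\log^2 x),
\]

which is the theorem. (Here and throughout the remainder of this section, the implied constant in the $O$-symbol may depend on $k$.)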
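The step that isolates the progression is the orthogonality relation $\frac{1}{h}\sum_\chi \chi(n)/\chi(l) = 1$ or $0$ according as $n \equiv l \pmod k$ or not. A small Python sketch for the modulus $k = 5$ follows. It is an illustration only: it assumes a prime modulus so that the characters can be written down from a primitive root (here 2), and the names `chi`, `indicator`, `DLOG` are hypothetical.

```python
import cmath

K = 5          # modulus, assumed prime so that the unit group is cyclic
GENERATOR = 2  # 2 generates the multiplicative group modulo 5
H = K - 1      # h = phi(5)

# Discrete logarithms: DLOG[n] = e where n = GENERATOR**e (mod K).
DLOG = {pow(GENERATOR, e, K): e for e in range(H)}


def chi(index, n):
    """The index-th character modulo 5 at n (index 0 is the principal one)."""
    if n % K == 0:
        return 0
    return cmath.exp(2j * cmath.pi * index * DLOG[n % K] / H)


def indicator(l, n):
    """(1/h) * sum over all characters of chi(n) / chi(l)."""
    return sum(chi(index, n) / chi(index, l) for index in range(H)) / H


for n in range(1, 11):
    print(n, round(abs(indicator(3, n)), 6))
```

Only the integers $n \equiv 3 \pmod 5$ give the value 1; all others, including multiples of 5, give 0.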





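To estimate the integrals appearing in Theorem 7-24, we must distinguish two cases. First consider the case $\chi = \chi_0$. Every property of the integrand which was used in estimating

\[ \int_{(2)}\frac{x^s}{s^2}\log\zeta(s)\,ds \]

carries over to the integrand of

\[ \int_{(2)}\frac{x^s}{s^2}\log L(s,\chi_0)\,ds. \]

It follows that for suitable $c$ with $0 < c < 1$,

\[ \int_{(2)}\frac{x^s}{s^2}\log L(s,\chi_0)\,ds = 2\pi i\int_c^1\frac{x^s}{s^2}\,ds + O\!\left(xe^{-\alpha\sqrt{\log x}}\right). \]

On the other hand, if $\chi \ne \chi_0$, then $L(s,\chi)$ has no pole at $s = 1$, but the other properties used earlier still obtain. Hence, if we do not cut the plane, but consider the line segments $\Gamma_5$ and $\bar\Gamma_5$ in Fig. 7-3 as a single segment $\Gamma_8$, and omit $\Gamma_6$, $\Gamma_7$, and $\bar\Gamma_6$, then the function

\[ \psi(s,\chi) = \frac{x^s}{s^2}\log L(s,\chi) \]

is regular in the region bounded by $\Gamma_1, \Gamma_2, \Gamma_3, \Gamma_4, \Gamma_8, \bar\Gamma_4, \bar\Gamma_3, \bar\Gamma_2, \bar\Gamma_1$, so that

\[ \int_{(2)}\psi(s,\chi)\,ds = \left(\int_{2-\infty i}^{2-ui} + \int_{\bar\Gamma_2+\bar\Gamma_3+\bar\Gamma_4+\Gamma_8+\Gamma_4+\Gamma_3+\Gamma_2} + \int_{2+ui}^{2+\infty i}\right)\psi(s,\chi)\,ds. \]

Moreover, the integral along each of these new arcs either tends to zero or is $O\!\left(xe^{-\alpha\sqrt{\log x}}\right)$.

It follows that

\[ \sum_{\substack{p\le x\\ p\equiv l \ (\mathrm{mod}\ k)}}\log\frac{x}{p} = \frac{1}{h}\int_c^1\frac{x^s}{s^2}\,ds + O\!\left(xe^{-\alpha\sqrt{\log x}}\right), \tag{16} \]

which is the analog of Theorem 7-15. In exactly the same way as Theorem 7-16 was deduced from Theorem 7-15, equation (16) leads to the desired result: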

Theorem 7-25. If $k$ is a fixed integer and $(k, l) = 1$, then, as $x \to \infty$,

\[ \pi(x; k, l) = \frac{1}{\varphi(k)}\int_2^x\frac{du}{\log u} + O\!\left(xe^{-a\sqrt{\log x}}\right). \]

As consequences of Theorem 7-25, we have that

\[ \pi(x; k, l) \sim \frac{1}{\varphi(k)}\,\frac{x}{\log x} \]

and that, if $(k, l_1) = (k, l_2) = 1$, then

\[ \frac{\pi(x; k, l_1)}{\pi(x; k, l_2)} \to 1 \qquad (x \to \infty), \]

so that asymptotically there are equally many primes in the progressions $kt + l_1$ and $kt + l_2$.
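The equidistribution among residue classes is easy to observe experimentally. The following Python sketch counts $\pi(x; k, l)$ for every $l$ prime to $k$; the helper names `primes_up_to` and `progression_counts` are illustrative, and the choice $k = 10$, $x = 10^6$ is arbitrary.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 2 assumed)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def progression_counts(x, k):
    """Return {l: pi(x; k, l)} for the residues 0 < l < k with (k, l) = 1."""
    counts = {l: 0 for l in range(1, k) if math.gcd(l, k) == 1}
    for p in primes_up_to(x):
        r = p % k
        if r in counts:  # primes dividing k fall in no admissible class
            counts[r] += 1
    return counts


counts = progression_counts(10**6, 10)
print(counts)
```

The four classes modulo 10 receive nearly equal shares of the primes, the excluded primes being only 2 and 5.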

A serious drawback of Theorem 7-25 is that the error term is not 
uniform in k. This precludes applying this version of the theorem 
to problems in which k increases with x, and these unfortunately 
are among the most important applications of this kind of theorem. 
It is known that the error term in Theorem 7-25 is uniform in $k$ for $k < \log^m x$ for some $m > 0$; in other words, the relation displayed in the theorem can be used if $k$ increases sufficiently slowly with $x$. The proof of the more general theorem, while similar to that given here, is more complicated. The chief difficulty is this: when dealing with fixed $k$, it is enough to prove that $L(s,\chi) \ne 0$ at $s = 1$ in order to deduce that for some $c_{11}(k)$, $L(s,\chi) \ne 0$ for $1 - c_{11} < \sigma < 1$, $|t| < c_9(k)$. When $k$ increases, however, $c_{11}$ might tend to zero quite rapidly as a function of $k$, in which case the integral along $\Gamma_8$ would not be negligible. It is therefore necessary to investigate further the zeros of the $L$-functions near the line $\sigma = 1$ for small $|t|$.


7-5 The integers representable as a sum of two squares. As a final illustration of the methods of this chapter, we shall obtain an asymptotic estimate for $B(x)$, the number of integers not exceeding $x$ which can be written as a sum of two squares. The integers counted are exactly those in whose prime-power factorization the primes $r \equiv 3 \pmod 4$ occur only to even powers.* The following heuristic argument indicates that it is to be expected that $B(x)$ is of the order of magnitude of $x/\sqrt{\log x}$, which is in agreement with the result to be obtained.
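The characterization of the integers counted by $B(x)$ can be verified by brute force for small $n$. The sketch below compares a direct search for a representation $n = a^2 + b^2$ with the factorization criterion; both function names are illustrative.

```python
import math


def is_sum_of_two_squares(n):
    """Brute force: is n = a^2 + b^2 for some integers a, b >= 0?"""
    a = 0
    while a * a <= n:
        b = math.isqrt(n - a * a)
        if a * a + b * b == n:
            return True
        a += 1
    return False


def passes_prime_criterion(n):
    """True if every prime r = 3 (mod 4) divides n to an even power (n >= 1)."""
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # What remains is 1 or a single prime occurring to the first power.
    return n % 4 != 3


print([n for n in range(1, 30) if is_sum_of_two_squares(n)])
```

The two tests agree for every $n$ in the range tried, as the cited theorem of Volume I requires.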

* Cf. Volume I, Theorem 7-3.

Take $x$ very large. Since one out of every $p$ integers is divisible by $p$, the number of integers up to $x$ not divisible by $p$ is about $x(1 - 1/p)$. Hence the number not divisible by any $p \le \sqrt{x}$ is roughly

\[ x\prod_{p\le\sqrt{x}}\left(1 - \frac{1}{p}\right), \]

so that, by the Prime Number Theorem,

\[ \prod_{p\le\sqrt{x}}\left(1 - \frac{1}{p}\right) \approx \frac{1}{\log x}, \]
where the symbol $\approx$ means "is probably of the order of magnitude of." To count the integers contributing to $B(x)$, we do not want to eliminate all composite numbers, but only those divisible by an odd power of any of the various primes $r \equiv 3 \pmod 4$. As in the cross-classification principle,* we can omit all those divisible by $r$, then reintroduce those divisible by $r^2$, then take out those divisible by $r^3$, etc., giving

\[ x\left(1 - \frac{1}{r}\right)\left(1 + \frac{1}{r^2}\right)\left(1 - \frac{1}{r^3}\right)\cdots \]

as the number left after accounting for the one prime $r$. (The product has only finitely many factors.) Hence,

\[ B(x) \approx x\prod_{r\le\sqrt{x}}\left(1 - \frac{1}{r}\right)\prod_{r\le\sqrt{x}}\left(1 + \frac{1}{r^2}\right)\cdots, \]

and since each product after the first converges as $x \to \infty$, we can write simply

\[ B(x) \approx x\prod_{r\le\sqrt{x}}\left(1 - \frac{1}{r}\right). \]


Now

\[ \log\prod_{p\le\sqrt{x}}\left(1 - \frac{1}{p}\right) = \sum_{p\le\sqrt{x}}\log\left(1 - \frac{1}{p}\right) \approx -\log\log x, \]

and since, by the results of the preceding section, about half the $p$'s are $r$'s, we have

\[ \log\prod_{r\le\sqrt{x}}\left(1 - \frac{1}{r}\right) \approx -\tfrac12\log\log x = -\log\sqrt{\log x}, \]

so that

\[ B(x) \approx \frac{x}{\sqrt{\log x}}. \]

* Cf. Volume I, Theorem 6-4.


Probably the most that can be said for this argument is that after seeing it, the reader should not be very surprised to learn that, for some $b > 0$,

\[ B(x) \sim \frac{bx}{\sqrt{\log x}}. \tag{17} \]

Nevertheless, it is just this type of reasoning which underlies the proof of (17) which will now be developed.

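If we put

\[ b_n = \begin{cases} 1 & \text{if } n = x^2 + y^2 \text{ for some integers } x, y, \\ 0 & \text{otherwise,} \end{cases} \]

then

\[ B(x) = \sum_{n\le x} b_n. \]

For $\sigma > 1$ let

\[ f(s) = \sum_{n=1}^{\infty}\frac{b_n}{n^s}; \]

the series converges absolutely in this domain, and uniformly in any closed bounded region to the right of the line $\sigma = 1$. Using $q$ and $r$ to denote primes congruent to 1 and 3 (mod 4) respectively, we deduce from the definition of $b_n$ and Theorem 6-3 that, for $\sigma > 1$,

\[
\begin{aligned}
f(s) &= (1 + 2^{-s} + 2^{-2s} + \cdots)\prod_q(1 + q^{-s} + q^{-2s} + \cdots)\prod_r(1 + r^{-2s} + r^{-4s} + \cdots) \\
&= (1 - 2^{-s})^{-1}\prod_q(1 - q^{-s})^{-1}\prod_r(1 - r^{-2s})^{-1}.
\end{aligned}
\]

As was shown in formula (15) of the preceding chapter,

\[ \zeta(s)L(s) = (1 - 2^{-s})^{-1}\prod_q(1 - q^{-s})^{-2}\prod_r(1 - r^{-2s})^{-1}, \tag{18} \]

where $L(s) = L(s,\chi)$ is the $L$-function for the nonprincipal character (mod 4),

\[ \chi(n) = \begin{cases} 0 & \text{if } 2 \mid n, \\ (-1)^{\frac12(n-1)} & \text{if } 2 \nmid n. \end{cases} \]

(The relation (18) was proved earlier only for $s > 1$ and real; the extension to the half-plane $\sigma > 1$ is immediate.) Hence, for $\sigma > 1$,

\[ f^2(s) = (1 - 2^{-s})^{-1}\prod_r(1 - r^{-2s})^{-1}\,\zeta(s)L(s). \tag{19} \]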
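The Euler product for $f(s)$ can be compared with the Dirichlet series at a point of absolute convergence. The Python sketch below does this at $s = 2$ with truncated sum and product. It is illustrative only: the names `b`, `f_series`, `f_product` and the truncation points are assumptions, and the agreement is limited by the neglected tails.

```python
import math


def b(n):
    """b_n = 1 if n is a sum of two squares, else 0 (brute force)."""
    a = 0
    while a * a <= n:
        c = math.isqrt(n - a * a)
        if a * a + c * c == n:
            return 1
        a += 1
    return 0


def f_series(s, terms):
    """Partial sum of sum_{n >= 1} b_n / n^s."""
    return sum(b(n) / n ** s for n in range(1, terms + 1))


def f_product(s, prime_bound):
    """(1 - 2^-s)^-1 * prod_q (1 - q^-s)^-1 * prod_r (1 - r^-2s)^-1, truncated."""
    value = 1 / (1 - 2 ** (-s))
    for p in range(3, prime_bound + 1, 2):
        if all(p % d for d in range(3, math.isqrt(p) + 1, 2)):  # p is an odd prime
            if p % 4 == 1:
                value /= 1 - p ** (-s)
            else:
                value /= 1 - p ** (-2 * s)
    return value


print(f_series(2, 5000), f_product(2, 5000))
```

Both truncations approach $f(2)$ from below, and their difference is of the order of the omitted tails.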


Since $L$ is regular for $\sigma > 0$ and

\[ L(1) = 1 - \frac13 + \frac15 - \cdots = \frac{\pi}{4}, \]

the function $\zeta L$ has a simple pole at $s = 1$, with residue $\pi/4$, but is otherwise regular for $\sigma > 0$. Moreover, neither $\zeta(s)$ nor $L(s)$ is zero for $s$ in the region $Q$ of Fig. 7-3, for suitable positive $c_{11}$ and $c_9$. Since the functions

\[ (1 - 2^{-s})^{-1} \quad\text{and}\quad \prod_r(1 - r^{-2s})^{-1} \tag{20} \]

are regular and different from zero for $\sigma > \tfrac12$, and bounded in absolute value for $\sigma \ge \sigma_0 > \tfrac12$, we deduce the following properties of $f$ from known properties of $\zeta$ and $L$.

Theorem 7-26. (a) $f^2(s)$ is regular and different from zero in the region $Q$ of Fig. 7-3, for suitable $c_{11}$ and $c_9$, and it has a simple pole at $s = 1$, with residue

\[ \frac{\pi}{2}\prod_r(1 - r^{-2})^{-1} = b_1^2. \tag{21} \]

Hence $f$ is also regular in $Q$, and $f^2(s)\cdot(s - 1)$ is regular in the uncut region $Q'$ formed from $Q$ by omitting $\Gamma_6$, $\Gamma_7$ and $\bar\Gamma_6$, and joining $\Gamma_5$ and $\bar\Gamma_5$.

(b) For $|t| > 8$ and $s$ in $Q$, the inequality $|f(s)| < c_{14}\log|t|$ holds (cf. Theorem 7-11 (b)).

From this follows the usual consequence.

Theorem 7-27. For suitable $c < 1$,

\[ \sum_{n\le x} b_n\log\frac{x}{n} = \frac{1}{\pi i}\int_c^1\frac{x^s}{s^2}f(s)\,ds + O\!\left(xe^{-\alpha\sqrt{\log x}}\right). \]

Proof: The proof follows the lines of that of Theorem 7-15 as regards changing the path of integration in the relation

\[ \sum_{n\le x} b_n\log\frac{x}{n} = \frac{1}{2\pi i}\int_{(2)}\frac{x^s}{s^2}f(s)\,ds \]

and estimating the new integrals along those paths which are bounded away from $s = 1$; the only change is that the estimate $|f(s)| < c_{14}\log|t|$ is used rather than $|\log\zeta(s)| < \log^2 t$. Omitting the tiresome details, we arrive at the relation

\[ \sum_{n=1}^{x} b_n\log\frac{x}{n} = \frac{1}{2\pi i}\left(\int_{\bar\Gamma_6} + \int_{\Gamma_7} + \int_{\Gamma_6}\right)\frac{x^s}{s^2}f(s)\,ds + O\!\left(xe^{-\alpha\sqrt{\log x}}\right). \]

In the neighborhood of $s = 1$, $f(s)$ has the expansion

\[ f(s) = \frac{b_1}{\sqrt{s-1}} + \cdots \]

with $b_1$ as in (21). Here $f(s) > 0$ for $s > 1$. Putting $s = 1 + \delta e^{\theta i}$, we have

\[ \int_{\Gamma_7}\frac{x^s}{s^2}f(s)\,ds = O\!\left(\frac{x^{1+\delta}}{(1-\delta)^2\sqrt{\delta}}\cdot 2\pi\delta\right) = o(1) \]

as $\delta \to 0^+$. Since $f^2(s)(s - 1)$ is single-valued in $Q'$, the quantity $2\arg f(s) + \arg(s - 1)$ is unchanged by traversing a path in $Q$ from $\bar\Gamma_6$ to $\Gamma_6$. Since $\arg(s - 1)$ increases by $2\pi$, $\arg f(s)$ decreases by $\pi$, so that $f(s)$ has opposite signs on the two edges of the cut. Hence

\[ \int_{\Gamma_6}\frac{x^s}{s^2}f(s)\,ds + \int_{\bar\Gamma_6}\frac{x^s}{s^2}f(s)\,ds = 2\int_{\bar\Gamma_6}\frac{x^s}{s^2}f(s)\,ds, \]

and

\[ \sum_{n=1}^{x} b_n\log\frac{x}{n} = \frac{1}{\pi i}\int_{1-c_{11}}^{1}\frac{x^s}{s^2}f(s)\,ds + O\!\left(xe^{-\alpha\sqrt{\log x}}\right). \]

The proof is complete.


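Theorem 7-28. As $x \to \infty$,

\[ B(x) = \frac{Bx}{\sqrt{\log x}} + O\!\left(\frac{x}{\log^{3/4} x}\right), \]

where

\[ B = \frac{1}{\sqrt{2}}\prod_r\left(1 - \frac{1}{r^2}\right)^{-1/2}. \]

Proof: On $\bar\Gamma_6$ we have

\[ \frac{f(s)}{s^2} = \frac{b_1 i}{\sqrt{1-s}\,\big(1 - (1-s)\big)^2} + O(\sqrt{1-s}) = \frac{b_1 i}{\sqrt{1-s}} + O(\sqrt{1-s}) \]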



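as $s \to 1$, so that

\[
\begin{aligned}
\sum_{n=1}^{x} b_n\log\frac{x}{n}
&= \frac{b_1}{\pi}\int_{1-c_{11}}^{1}\frac{x^s}{\sqrt{1-s}}\,ds + O\!\left(\int_{1-c_{11}}^{1}x^s\sqrt{1-s}\,ds\right) + O\!\left(xe^{-\alpha\sqrt{\log x}}\right) \\
&= \frac{b_1}{\pi}\int_0^{c_{11}}x^{1-u}u^{-1/2}\,du + O\!\left(\int_0^{c_{11}}x^{1-u}u^{1/2}\,du\right) + O\!\left(\frac{x}{\log^{3/2}x}\right) \\
&= \frac{b_1x}{\pi}\int_0^{c_{11}}e^{-u\log x}u^{-1/2}\,du + O\!\left(x\int_0^{c_{11}}e^{-u\log x}u^{1/2}\,du\right) + O\!\left(\frac{x}{\log^{3/2}x}\right) \\
&= \frac{b_1x}{\pi\sqrt{\log x}}\int_0^{c_{11}\log x}e^{-v}v^{-1/2}\,dv + O\!\left(\frac{x}{\log^{3/2}x}\right) \\
&= \frac{b_1x}{\pi\sqrt{\log x}}\left(\sqrt{\pi} + O\!\left(\int_{c_{11}\log x}^{\infty}e^{-v}v^{-1/2}\,dv\right)\right) + O\!\left(\frac{x}{\log^{3/2}x}\right) \\
&= \frac{b_1x}{\sqrt{\pi\log x}} + O\!\left(\frac{x}{\log^{3/2}x}\right).
\end{aligned}
\]

Thus

\[ \sum_{n=1}^{x} b_n\log\frac{x}{n} = \frac{Bx}{\sqrt{\log x}} + O\!\left(\frac{x}{\log^{3/2}x}\right), \]

where

\[ B = \frac{b_1}{\sqrt{\pi}} = \frac{1}{\sqrt{2}}\prod_r\left(1 - \frac{1}{r^2}\right)^{-1/2}. \]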


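Now let $\delta = \delta(x)$ be positive. Then

\[
\begin{aligned}
\sum_{n=1}^{x+\delta x} b_n\log\frac{x+\delta x}{n} - \sum_{n=1}^{x} b_n\log\frac{x}{n}
&= \log(1+\delta)\sum_{n=1}^{x} b_n + \sum_{n=x+1}^{x+\delta x} b_n\log\frac{x+\delta x}{n} \\
&= \log(1+\delta)\,B(x) + O\big(\log(1+\delta)\cdot\delta x\big),
\end{aligned}
\]

and

\[
\begin{aligned}
\frac{Bx(1+\delta)}{\sqrt{\log(x+\delta x)}} - \frac{Bx}{\sqrt{\log x}}
&= \frac{Bx}{\sqrt{\log x}}\left(\frac{1+\delta}{\sqrt{\log(x+\delta x)/\log x}} - 1\right) \\
&= \frac{Bx}{\sqrt{\log x}}\left(\frac{1+\delta}{\sqrt{1 + \log(1+\delta)/\log x}} - 1\right) \\
&= \frac{Bx}{\sqrt{\log x}}\left(\frac{1+\delta}{1 + O(\delta/\log x)} - 1\right) \\
&= \frac{Bx}{\sqrt{\log x}}\cdot\frac{\delta + O(\delta/\log x)}{1 + O(\delta/\log x)} \\
&= \frac{Bx}{\sqrt{\log x}}\big(\delta + O(\delta/\log x)\big).
\end{aligned}
\]

Since $\log(1+\delta) = \delta + O(\delta^2)$ as $\delta \to 0$, it follows that

\[
\begin{aligned}
B(x) &= \frac{1}{\log(1+\delta)}\left\{\frac{Bx}{\sqrt{\log x}}\big(\delta + O(\delta/\log x)\big) + O\!\left(\frac{x}{\log^{3/2}x}\right)\right\} + O(\delta x) \\
&= \frac{Bx}{\sqrt{\log x}}\big(1 + O(\delta)\big) + O\!\left(\frac{x}{\log^{3/2}x}\right) + O(\delta x) + O\!\left(\frac{x}{\delta\log^{3/2}x}\right).
\end{aligned}
\]

Choosing $\delta(x) = \log^{-3/4}x$, we obtain

\[
\begin{aligned}
B(x) &= \frac{Bx}{\sqrt{\log x}} + O\!\left(\frac{x}{\log^{5/4}x}\right) + O\!\left(\frac{x}{\log^{3/2}x}\right) + O\!\left(\frac{x}{\log^{3/4}x}\right) + O\!\left(\frac{x}{\log^{3/4}x}\right) \\
&= \frac{Bx}{\sqrt{\log x}} + O\!\left(\frac{x}{\log^{3/4}x}\right),
\end{aligned}
\]

and the proof is complete.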
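The final estimate can be explored numerically. The Python sketch below counts $B(x)$ directly and evaluates a truncation of the product defining the constant $B$; the function names and the truncation point are illustrative assumptions. Because the error term is only a power of $\log x$ smaller than the main term, close agreement should not be expected at accessible values of $x$.

```python
import math


def count_sums_of_two_squares(x):
    """B(x): the number of n with 1 <= n <= x of the form a^2 + b^2."""
    hit = bytearray(x + 1)
    a = 0
    while a * a <= x:
        b = a  # a <= b suffices, by symmetry
        while a * a + b * b <= x:
            hit[a * a + b * b] = 1
            b += 1
        a += 1
    return sum(hit[1:])  # n = 0 is not counted


def leading_constant(prime_bound):
    """Truncation of B = 2^(-1/2) * prod_{r = 3 (mod 4)} (1 - r^-2)^(-1/2)."""
    value = 1 / math.sqrt(2)
    for r in range(3, prime_bound + 1, 4):  # integers congruent to 3 mod 4
        if all(r % d for d in range(3, math.isqrt(r) + 1, 2)):  # r is prime
            value /= math.sqrt(1 - r ** -2)
    return value


x = 10**6
big_b = leading_constant(10_000)
main_term = big_b * x / math.sqrt(math.log(x))
print(count_sums_of_two_squares(x), main_term)
```

The ratio of the two printed numbers tends to 1 only slowly, which reflects the weak error term $O(x/\log^{3/4} x)$.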